Visualize your Japanese (日本語) progress in Duolingo every day. Powered by Python, Poetry, GitHub Actions, and GitHub Pages (HTML, CSS, JS, Bootstrap).
Currently, the automation is done and fully tested, and the website is live on GitHub Pages. Please see the link in the repository to take a look!
Every day, I practice Japanese (日本語) on Duolingo.
One feature I find sorely missing in Duolingo is a graph that traces your language learning progress. As with other skills, a language requires constant, daily practice to keep your abilities sharp. With such a feature, we could trace our learning progress every day, which I believe is useful for several reasons. For example, gaining a steady amount of Duolingo experience points every day gives you an idea of how consistent your learning is. This project traces several metrics from your usage of the application and plots them on a chart to help you visualize your progress.
Mainly, this project is inspired by:
Thank you for the inspiration!
Metrics that are visualized in this website:
Please note that all of these metrics are updated and synchronized daily, and they strive to show an accurate representation of your progress.
Experience points gained, time spent in Duolingo, and number of sessions are synchronized with the API (your real data) on each run, so you do not have to worry about the accuracy of your progress.
This system is composed of three components: Fetcher, Visualizer, and Automation. The following graph shows the system architecture and its components as a whole.
---
title: "Duolingo Visualizer System Architecture"
---
flowchart TD
DuolingoAPI[Duolingo API]
DuolingoDatabase[("Duolingo Database")]
FetcherScript[Python Fetcher]
FetcherDatabase[("JSON File / GitHub Repository")]
VisualizerSite[Static Website]
AutomationRunner[GitHub Action Runner]
AutomationPages[GitHub Pages]
subgraph Duolingo
DuolingoDatabase --> |"Get data"| DuolingoAPI
end
subgraph Fetcher
DuolingoAPI --> |"Fetch data from Duolingo API"| FetcherScript
FetcherScript --> |"Store and commit to the data-store"| FetcherDatabase
end
subgraph Visualizer
FetcherDatabase --> |"Use data"| VisualizerSite
VisualizerSite --> |"Deploy to"| AutomationPages
end
subgraph Automation
AutomationRunner --> |"Run and sync data everyday"| FetcherScript
AutomationRunner --> |"Build and deploy everyday"| VisualizerSite
end
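As a rough illustration of the Fetcher box in the diagram, the sketch below merges one day's metrics into a JSON data store. It is a minimal sketch under stated assumptions, not the repository's actual code: `fetch_daily_metrics` is a hypothetical stand-in for the Duolingo API call (it only returns fixed sample values), `store_progress` is a hypothetical helper, and the date-keyed layout is assumed from the sample data shown in the development section of this README.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def fetch_daily_metrics() -> dict[str, int]:
    """Hypothetical stand-in for the Duolingo API call; returns fixed sample values."""
    return {"number_of_sessions": 27, "session_time": 4792, "streak": 1, "xp_today": 754}


def store_progress(path: Path, day: date, metrics: dict[str, int]) -> dict[str, dict[str, int]]:
    """Merge one day's metrics into the JSON store, overwriting that day's entry."""
    progress: dict[str, dict[str, int]] = {}
    if path.exists():
        loaded = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: anything that is not a date-keyed object
        # (for example a file reset to []) is treated as an empty store.
        if isinstance(loaded, dict):
            progress = loaded
    progress[day.strftime("%Y/%m/%d")] = metrics
    path.write_text(json.dumps(progress, indent=2, sort_keys=True), encoding="utf-8")
    return progress


if __name__ == "__main__":
    # Write to a temporary directory so the sketch never touches real data.
    store = Path(tempfile.mkdtemp()) / "duolingo-progress.json"
    print(store_progress(store, date(2022, 7, 28), fetch_daily_metrics()))
```

Keying the store by date makes repeated runs on the same day idempotent: a second run replaces that day's entry instead of appending a duplicate.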
The right and recommended usage is: "you should never have to run this script manually, except in some rare circumstances". The way to use this repository is as follows:
- Get your JWT from your browser's developer tools: Inspect -> Application -> Cookies -> copy the jwt_token entry.
- Put your Duolingo credentials in the repository secrets (Settings -> Secrets -> Actions): DUOLINGO_USERNAME, DUOLINGO_PASSWORD, DUOLINGO_JWT.
- Also set GIT_AUTHOR_EMAIL and GIT_AUTHOR_NAME (equivalent to what you do when setting up Git: git config --global user.email ...).
- Reset the data/duolingo-progress.json file manually, leaving only [] (an empty array) in that file.
Please note that the cron scheduler may sometimes be delayed because of unforeseen circumstances on GitHub's side. That's why I provided the workflow_dispatch option, so the workflow can still be run even when the cron scheduler fails to run.
[!WARNING] It is recommended that you run this script with your JWT and not your password, for security reasons.
[!NOTE] For production, this should be run in GitHub Actions only.
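To illustrate the warning above, here is a minimal sketch of how a script could pick its login method from the environment, preferring the JWT and falling back to the password only when no JWT is set. The `resolve_credentials` function and the `Credentials` type are hypothetical names used for illustration, and the fallback rule is an assumption; only the environment variable names come from this README.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Credentials(NamedTuple):
    username: str
    method: str  # either "jwt" or "password"
    secret: str


def resolve_credentials(env: Optional[Mapping[str, str]] = None) -> Credentials:
    """Choose a login method from environment variables, preferring the JWT."""
    env = os.environ if env is None else env
    username = env.get("DUOLINGO_USERNAME", "").strip()
    if not username:
        raise ValueError("DUOLINGO_USERNAME is not set")
    jwt = env.get("DUOLINGO_JWT", "").strip()
    if jwt:
        return Credentials(username, "jwt", jwt)
    # Assumption: the password is only a fallback when no JWT is provided.
    password = env.get("DUOLINGO_PASSWORD", "")
    if password:
        return Credentials(username, "password", password)
    raise ValueError("Set DUOLINGO_JWT (recommended) or DUOLINGO_PASSWORD")


if __name__ == "__main__":
    try:
        creds = resolve_credentials()
        # Never print the secret itself, only which method was selected.
        print(f"Would log in as {creds.username} using the {creds.method}")
    except ValueError as error:
        print(f"Configuration error: {error}")
```

Failing early with a clear message is useful in GitHub Actions, where a missing secret otherwise shows up as an empty string deep inside the run.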
If you want to run the script manually, then:
# Clone repository.
git clone git@github.com:lauslim12/japanese-duolingo-visualizer.git
cd japanese-duolingo-visualizer
# Use `poetry` as the package manager.
poetry shell
poetry install
# Set the necessary environment variables, or else it will not work.
export DUOLINGO_USERNAME=...
export DUOLINGO_PASSWORD=...
export DUOLINGO_JWT=...
# Run script.
poetry run python3 main.py
For development, if you wish to develop the visualizer, you have to mock the data in web/index.html, more specifically in the getDataFromJSON() function. Hard-code the data there (replacing the response.json() result) with something like the following:
{
"2022/07/28": {
"number_of_sessions": 27,
"session_time": 4792,
"streak": 1,
"xp_today": 754
}
}
You may have to add more than one entry to make sure that everything renders well. Developing the other components should be more straightforward (Python code and GitHub Actions as the infrastructure).
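Since a single entry rarely exercises the charts, a small generator can help. The following is a minimal sketch that prints several days of mock data in the shape shown above. `make_mock_progress` is a hypothetical helper, and the value ranges are arbitrary assumptions rather than real Duolingo limits.

```python
import json
import random
from datetime import date, timedelta


def make_mock_progress(start: date, days: int, seed: int = 0) -> dict[str, dict[str, int]]:
    """Build `days` consecutive mock entries, keyed by date as YYYY/MM/DD."""
    rng = random.Random(seed)  # seeded so the mock data is reproducible
    progress: dict[str, dict[str, int]] = {}
    for offset in range(days):
        day = start + timedelta(days=offset)
        progress[day.strftime("%Y/%m/%d")] = {
            # Arbitrary ranges, chosen only to produce varied charts.
            "number_of_sessions": rng.randint(1, 30),
            "session_time": rng.randint(60, 5000),
            "streak": offset + 1,  # the streak grows by one per consecutive day
            "xp_today": rng.randint(10, 800),
        }
    return progress


if __name__ == "__main__":
    print(json.dumps(make_mock_progress(date(2022, 7, 28), days=7), indent=2))
```

The printed JSON can be pasted where the `response.json()` result is hard-coded.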
Make sure to run these commands to keep the code quality consistent:
# run ruff, mypy, pytest
ruff check
ruff format
mypy
pytest --verbose --cov=src tests
Please also write tests if you want to add a new feature!
Aside from the names and projects written above, I would also like to thank:
MIT License.