lauslim12 / japanese-duolingo-visualizer

I am studying Japanese on my own. This is the progress of my Duolingo statistics. Developed with Python, Poetry, GitHub Actions/Pages.
https://lauslim12.github.io/japanese-duolingo-visualizer/
MIT License

Enhanced Data Structure and Algorithms #14

Closed: lauslim12 closed this 2 months ago

lauslim12 commented 2 months ago

Description

From #13, I realized that the current state of the automation is slightly over-engineered and clunky.

After revisiting the codebase, picking up some ideas for improving it with Pydantic and Pytest, and finding the time to do it (I'm very much interested in optimizing this further, alongside general maintenance work), I changed the algorithm, the data structure, and the typing.

Algorithm

Before synchronizing the database, it first finds the start and end dates by combining dict[str, DatabaseEntry] and list[Summary]: the minimum date across both becomes the start, and the maximum becomes the end. After this, every date in that range is generated declaratively with functional programming (takewhile, count). Once all of the possible dates are known, the streak is generated with functional programming again (accumulate) to determine whether the streak has been broken or not.
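To make the steps above concrete, here is a minimal sketch of the date generation and streak calculation. It is not the repository's actual code: the names `date_range` and `streaks` are hypothetical, and it assumes that a day counts toward the streak simply when it appears in the known data, which may differ from the real rule.

```python
from datetime import date, timedelta
from itertools import accumulate, count, takewhile


def date_range(start: date, end: date) -> list[str]:
    """Every ISO date from start to end, inclusive."""
    # count() yields 0, 1, 2, ... so this is an endless stream of days.
    days = (start + timedelta(days=offset) for offset in count())
    # takewhile stops the stream as soon as a day passes the end date.
    return [day.isoformat() for day in takewhile(lambda day: day <= end, days)]


def streaks(dates: list[str], active: set[str]) -> dict[str, int]:
    """Running streak per date: +1 on an active day, reset to 0 otherwise."""
    running = accumulate(
        dates,
        lambda streak, day: streak + 1 if day in active else 0,
        initial=0,
    )
    next(running)  # drop the initial 0 seed so values line up with dates
    return dict(zip(dates, running))


# Hypothetical inputs: dates already stored and dates returned by the API.
database_dates = ["2024-01-01", "2024-01-02"]
summary_dates = ["2024-01-04", "2024-01-05"]
known = set(database_dates) | set(summary_dates)

# ISO date strings sort chronologically, so min/max work on them directly.
all_dates = date_range(date.fromisoformat(min(known)), date.fromisoformat(max(known)))
result = streaks(all_dates, known)
```

In this example `2024-01-03` is missing from both sources, so the streak resets to 0 on that day and starts counting again from `2024-01-04`.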

With this algorithm, it should now be possible to accurately find and synchronize dates effortlessly.

Data Structure

One thing that I find strange is that a day can only have one piece of information, but the data structure from the API is a list[Summary]. To leverage the type system, I transformed it into dict[str, DatabaseEntry] so that it accurately represents the proper state. With this in place, impossible states such as two entries for the same day can no longer be represented.
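As a rough illustration of the transformation, the sketch below keys entries by date. The field names (`date`, `xp`) and the helper `to_database` are assumptions made for the example, and plain dataclasses stand in for the Pydantic models to keep it dependency-free.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    date: str  # hypothetical field: ISO date such as "2024-01-01"
    xp: int    # hypothetical field: experience gained that day


@dataclass(frozen=True)
class DatabaseEntry:
    xp: int


def to_database(summaries: list[Summary]) -> dict[str, DatabaseEntry]:
    """Key each entry by date so a day can hold at most one entry."""
    # If the list repeats a date, the later item wins (an assumption of this sketch).
    return {summary.date: DatabaseEntry(xp=summary.xp) for summary in summaries}


database = to_database(
    [
        Summary(date="2024-01-01", xp=10),
        Summary(date="2024-01-01", xp=20),  # duplicate day collapses into one key
        Summary(date="2024-01-02", xp=5),
    ]
)
```

Because dictionary keys are unique, the duplicate day cannot survive the conversion, which is the property the list form could not guarantee.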

Typing

A lot of the type annotations are improved through the use of Pydantic's BaseModel.

Web

Implemented better caching by calculating the data in advance instead of deriving it on each call, which incurred a computation cost.

Tests

Thorough tests have been created to ensure correctness, so the coverage should now be back to 100%.