subconcept-labs / ulangi

Ulangi is a language flashcards app with spaced repetition system and more.
https://ulangi.com
GNU General Public License v3.0

True randomization for spaced repetition #89

Closed: Kwintenvdb closed this issue 4 years ago

Kwintenvdb commented 4 years ago

First of all thanks for creating this wonderful app and making it open source.

I am using it to train on a list of 4000 words sorted by how commonly they are used in the language: the most common words are at the top of the list and the less common ones at the bottom. Since I imported the list in this order through Google Sheets, this is also the order in which the words appear during a spaced repetition review. I don't care much about learning the words in that order, and would prefer a mix of "rare" and common words.

The shuffling of the vocabulary list implemented in https://github.com/ulangi/ulangi/commit/9f848976945d02d1a09e7196def34be8477ce67c isn't true randomization on the database level, since the set of words is first fetched from the database and only shuffled afterwards. In other words, it would fetch the 10 most commonly used words first, then shuffle them. That's not very useful when you're already at an intermediate level of the language.
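
To illustrate the difference, here is a minimal sketch in Python with an in-memory SQLite database. It is not Ulangi's actual code: the `vocabulary` table, the `freq_rank` column, and both helper functions are made up for this example.

```python
import random
import sqlite3

# Hypothetical schema for illustration only; not Ulangi's real tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vocabulary (freq_rank INTEGER PRIMARY KEY, term TEXT)")
conn.executemany(
    "INSERT INTO vocabulary (freq_rank, term) VALUES (?, ?)",
    [(i, f"word{i}") for i in range(1, 4001)],
)


def fetch_then_shuffle(limit):
    # Fetch the first `limit` rows in stored order, then shuffle them in memory.
    rows = conn.execute(
        "SELECT freq_rank FROM vocabulary ORDER BY freq_rank LIMIT ?", (limit,)
    ).fetchall()
    ranks = [row[0] for row in rows]
    random.shuffle(ranks)
    return ranks


def random_at_db_level(limit):
    # Let SQLite pick `limit` rows at random from the whole table.
    rows = conn.execute(
        "SELECT freq_rank FROM vocabulary ORDER BY RANDOM() LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]


print(sorted(fetch_then_shuffle(10)))  # always the ranks 1 through 10
print(sorted(random_at_db_level(10)))  # ranks drawn from all 4000 words
```

The first function can only ever return the ten most common words, in a different order each time; the second can return any word in the list.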

I could of course archive the most common words that I already know well, but I think true randomization would still be a very helpful feature.

jimmyloi commented 4 years ago

I can add randomization at the database level, but it might have a significant performance impact when your database becomes large.

The database engine does not provide a way to pick random rows. The only way to get random rows is to assign a random number to each row, sort the whole table by those random numbers (in effect shuffling it), and then pick the top rows.

https://stackoverflow.com/questions/580639/how-to-randomly-select-rows-in-sql

Kwintenvdb commented 4 years ago

I'm not sure I agree that it would be as expensive as you assume. I've prepared a very rudimentary example here: https://www.db-fiddle.com/f/giSzk8GW8ZjicBs3zGc1gm/1

Even ordering 20k rows by random (far more data than I would expect the vast majority of users to have in a single set of flashcards) executes in about 10ms here.

Even so, ORDER BY RAND() isn't the only solution, and you could for example also assign a randomly generated number to each flashcard as they are stored in the DB, and sort by those later.
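
A rough sketch of that second option, again in Python with SQLite. The `flashcard` table, the `shuffle_key` column, the index name, and the helper functions are all assumptions for the example and do not reflect the app's schema.

```python
import random
import sqlite3

# Hypothetical schema: each card gets a random key when it is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flashcard (id INTEGER PRIMARY KEY, term TEXT, shuffle_key REAL)"
)
# An index on the key lets SQLite read rows in key order without a full sort.
conn.execute("CREATE INDEX idx_flashcard_shuffle_key ON flashcard (shuffle_key)")


def add_flashcard(term):
    # Assign the random number once, at insert time.
    conn.execute(
        "INSERT INTO flashcard (term, shuffle_key) VALUES (?, ?)",
        (term, random.random()),
    )


for i in range(1, 1001):
    add_flashcard(f"word{i}")


def next_batch(limit):
    # Sort by the stored random key instead of calling RANDOM() per query.
    rows = conn.execute(
        "SELECT id FROM flashcard ORDER BY shuffle_key LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]


print(next_batch(10))
```

One trade-off to keep in mind: the order stays the same across queries until the keys are regenerated, so this gives a fixed shuffle rather than a fresh one on every review.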

Either way it could very well be implemented as an optional feature in the spaced repetition review so that users can choose whether they want this random ordering or not.

As mentioned, this feature is very important to me, and I'd be willing to help out with the development where needed.

EDIT: I just realized you are querying a local SQLite database instead of the MySQL server in the backend. I've tested the same example on a much larger dataset without any noticeable performance hitch. I cannot attest to how well this query performs when run locally on mobile devices, though.

EDIT2: I've tested this on a local Sqlite client running on an iPhone 11 Pro. Doing a SELECT * FROM data ORDER BY RANDOM() LIMIT 50 over a table with 1 million rows takes 130ms. Of course you will probably have a little more overhead from the Sqlite client used in the app, but considering how I wouldn't expect any set of flashcards to have over 10k rows, this should still be blazing fast.
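
For anyone who wants to repeat this kind of measurement, a minimal timing sketch in Python follows. The `data` table name comes from the query above; the column layout, the 100,000-row size (smaller than the 1 million rows mentioned, to keep the script quick), and the `timed_random_sample` helper are assumptions. Timings depend heavily on the device and the SQLite client, so no expected numbers are given.

```python
import sqlite3
import time

# In-memory table named `data`; the columns are made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO data (value) VALUES (?)",
    ((f"row{i}",) for i in range(100_000)),
)


def timed_random_sample(limit):
    # Run the random-order query and measure wall-clock time in milliseconds.
    start = time.perf_counter()
    rows = conn.execute(
        "SELECT * FROM data ORDER BY RANDOM() LIMIT ?", (limit,)
    ).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return rows, elapsed_ms


rows, elapsed_ms = timed_random_sample(50)
print(f"{len(rows)} rows in {elapsed_ms:.1f} ms")
```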

jimmyloi commented 4 years ago

@Kwintenvdb This is only possible because we are querying everything from the local database. The remote database cannot handle this bottleneck. I think it will work fine for most users (might be slow on old Android devices). This feature is easy to add. I will implement it soon.

Kwintenvdb commented 4 years ago

Fantastic! Looking forward to using this feature.

jimmyloi commented 4 years ago

Fixed in aa1b9a1816bfdfcfbf3e1bce036a0f895316a2c0

Kwintenvdb commented 4 years ago

Awesome! Thanks a lot. How long does it generally take for a new release to be published on the App Store?

jimmyloi commented 4 years ago

It's already available for Android now. For iOS, it sometimes takes a few days.