ankidroid / Anki-Android

AnkiDroid: Anki flashcards on Android. Your secret trick to achieve superhuman information retention.
GNU General Public License v3.0
8.6k stars 2.23k forks source link

Improve Card browser scrolling performance #11889

Open oakkitten opened 2 years ago

oakkitten commented 2 years ago

I think Card browser scrolling performance is slightly suboptimal. The UI freezes a lot when you flick down or up, and when I drag the scrollbar I get about 3 FPS.

There seems to be a task that pre-renders the table when you scroll in a background thread. It only renders a little more than 2 screenfuls of rows. I tried to make that task render all rows, and here are the results:

Some of the problems with this are:

I think this can be drastically improved by taking these steps:

Also, currently browser is reset every time it is opened. If one day it keeps its state, it'd be useful to think about how it could be updated with new data. E.g. make a search in browser, go to reviewer, edit a card there, go back to search, card is updated with animation. Normally this is done via calculating a diff; this probably wouldn't be feasible for large search results.

viciousAegis commented 2 years ago

In my initial plan I have included switching to RecyclerView, and I believe I would be doing that, so we can cross that out 👍🏼

dae commented 2 years ago

Some of the performance problems can likely be solved with col.newBackend.backend.browserRowForId(). It's intended to be used on demand (eg with a RecyclerView or similar)

oakkitten commented 2 years ago

col.newBackend.backend.browserRowForId()

I am assuming that it ultimately calls this. I am not sure if this will be faster at all. Briefly looking at the code, it seems that it constructs both the card and the note objects from scratch. Not only some of these might not be needed for the query, but these appear to do extra queries; besides, these queries are not optimized the way cursor IO is supposed to be. So—correct me if I'm wrong—the speed increase here will be due to running Rust instead of Java, which will pale before the IO penalty which is mostly the same. (That is, if Rust is faster here—JIT can do miracles!)

dae commented 2 years ago

Creating card/note structures in Rust is reasonably fast, and makes the code simpler and more maintainable than manually deserializing data from rows. I don't think your cursor solution will work: for one, if the user very quickly scrolls down, you'll pay the cost of walking the btree in the same way that large offsets are inadvisable. And it's a moot point anyway, since AnkiDroid is delegating to Rust for the DB access, and AFAIK that implementation fetches the entire query upfront before paginating it back to the Java layer.

oakkitten commented 2 years ago

My general idea was that the actual IO, if using a cursor or a similar thing, can be very fast. For instance, this single SQL query is enough to show questions and answers for the search, which is what AnkiDroid shows by default, and it only takes 90ms in Anki on my ancient laptop—not sure if it should be faster than my phone:

SELECT cards.id, notes.mid, cards.did, cards.ord, notes.flds, notes.flags, notes.tags 
FROM cards, notes 
WHERE cards.nid = notes.id
AND notes.flds like "%a%"

And even if it takes a lot of time to compute stuff to be displayed, it should be less of an issue, since the data is already in memory and fit for random access, and you can easily distribute computation across the CPUs.


P.S. I tried running some silly benchmarks in AnkiDroid.

I would love to benchmark col.newBackend.backend.browserRowForId() but I'm not sure how to call it.

Silly benchmark code; output:

// legacy_schema=true
#0: found 14682 questions in 8.269301397s
#1: found 14682 questions in 39.042556650s
#2: found 14682 questions in 25.410680396s
#3: found 14682 questions in 427.548116ms

// legacy_schema=false
#0: found 14682 questions in 9.970149379s
#1: found 14682 questions in 19.664074354s
#2: found 14682 questions in 30.226884574s
#3: found 14682 questions in 448.768482ms

P.P.S. In desktop Anki, this takes around 1.5s:

for (id,) in mw.col.db.all("""
    SELECT cards.id FROM cards, notes WHERE cards.nid = notes.id AND notes.flds like "%a%"
"""):
    Card(mw.col, id).note()

Not making any conclusions from this, but this is around 17 times 90 ms, which is consistent with 8s ÷ 430ms, which is 19.

dae commented 2 years ago

There are multiple problems with your benchmarks:

When I tweak the code to use browserRowForId() and measure the full time of the query, the backend wins across the board, especially when the question needs to be rendered:

2022-08-12 11:09:56.662 25122-25206/com.ichi2.anki E/WeaselKt: runQuery round 2
2022-08-12 11:09:56.911 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionDroid: found 20 rows in 239.160346ms
2022-08-12 11:09:57.026 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionBackend: found 20 rows in 92.920019ms
2022-08-12 11:09:57.804 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionDroid: found 40000 rows in 752.482667ms
2022-08-12 11:09:58.329 25122-25206/com.ichi2.anki E/WeaselKt: runQuery noQuestionBackend: found 40000 rows in 510.395245ms
2022-08-12 11:09:58.630 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionDroid: found 20 rows in 292.183416ms
2022-08-12 11:09:58.733 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionBackend: found 20 rows in 92.161677ms
2022-08-12 11:10:01.691 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionDroid: found 1000 rows in 2.943828616s
2022-08-12 11:10:03.168 25122-25206/com.ichi2.anki E/WeaselKt: runQuery questionBackend: found 40000 rows in 1.451204762s

Code:

package com.ichi2.anki
import com.google.protobuf.kotlin.toByteStringUtf8
import com.ichi2.libanki.Card
import com.ichi2.libanki.Collection
import com.ichi2.libanki.SortOrder
import com.ichi2.libanki.Utils
import timber.log.Timber
import kotlin.time.ExperimentalTime
import kotlin.time.measureTime

@Suppress("UNUSED_VARIABLE")
fun noQuestionDroid(col: Collection, limit: Int): List<String> {
    val cursor = col.db.database.query(
        """
        SELECT cards.id, notes.mid, cards.did, cards.ord, notes.flds, notes.flags, notes.tags, cards.reps 
        FROM cards, notes 
        WHERE cards.nid = notes.id
        AND notes.flds like "%a%"
        order by reps
        """
    )

    var n = 0
    return sequence {
        while (cursor.moveToNext() && n < limit) {
            val cid = cursor.getLong(0)
            val mid = cursor.getLong(1)
            val did = cursor.getLong(2)
            val ord = cursor.getInt(3)
            val fields = cursor.getString(4)
            val flags = cursor.getInt(5)
            val tags = cursor.getString(6)
            val reps = cursor.getInt(7)
            n += 1

            yield(reps.toString())
        }
    }.toList()
}

@Suppress("UNUSED_VARIABLE")
fun questionDroid(col: Collection, limit: Int): List<String> {
    val cursor = col.db.database.query(
        """
        SELECT cards.id, notes.mid, cards.did, cards.ord, notes.flds, notes.flags, notes.tags 
        FROM cards, notes 
        WHERE cards.nid = notes.id
        AND notes.flds like "%a%"
        order by sfld
        """
    )

    var n = 0
    return sequence {
        while (cursor.moveToNext() && n < limit) {
            val cid = cursor.getLong(0)
            val mid = cursor.getLong(1)
            val did = cursor.getLong(2)
            val ord = cursor.getInt(3)
            val fields = cursor.getString(4)
            val flags = cursor.getInt(5)
            val tags = cursor.getString(6)
            n += 1

            val model = col.models.get(mid)!!
            val splitFields = Utils.splitFields(fields)
            yield(col._renderQA(cid, model, did, ord, tags, splitFields, flags)["q"]!!)
        }
    }.toList()
}

@Suppress("UNUSED_VARIABLE")
fun noQuestionBackend(col: Collection, limit: Int): List<String> {
    col.backend.setActiveBrowserColumns(listOf("cardReps"))
    val ids = col.findCards("a", SortOrder.AfterSqlOrderBy("reps"))
    return ids.asSequence().take(limit).map { col.backend.browserRowForId(it).cellsList[0].text }.toList()
}

@Suppress("UNUSED_VARIABLE")
fun questionBackend(col: Collection, limit: Int): List<String> {
    col.backend.setActiveBrowserColumns(listOf("question"))
    val ids = col.findCards("a", SortOrder.AfterSqlOrderBy("sfld"))
    return ids.asSequence().take(limit).map { col.backend.browserRowForId(it).cellsList[0].text }.toList()
}

@OptIn(ExperimentalTime::class)
fun runQuery(col: Collection, name: String, builder: (col: Collection) -> List<String>) {
    System.gc()
    System.runFinalization()

    var size: Int
    val totalTime = measureTime {
        size = builder(col).size
    }
    Timber.wtf("runQuery $name: found $size rows in $totalTime")
}

fun runQueries(col: Collection) {
    // run a few times to prime DB cache
    for (i in listOf(1, 2, 3)) {
        Timber.e("runQuery round $i")
        runQuery(col, "noQuestionDroid", { noQuestionDroid(it, 20) })
        runQuery(col, "noQuestionBackend", { noQuestionBackend(it, 20) })
        runQuery(col, "noQuestionDroid", { noQuestionDroid(it, 40000) })
        runQuery(col, "noQuestionBackend", { noQuestionBackend(it, 40000) })

        runQuery(col, "questionDroid", { questionDroid(it, 20) })
        runQuery(col, "questionBackend", { questionBackend(it, 20) })
        runQuery(col, "questionDroid", { questionDroid(it, 1000) })
        runQuery(col, "questionBackend", { questionBackend(it, 40000) })
    }
}
oakkitten commented 2 years ago

Ahh this is great! It turns out that browserRowForId is so fast we can probably forgo caching entirely.

I wanted to test for thoughput because I thought that the only way to bring screen render under the fabled 16ms (or is it 8ms now?) would be to prepare all, or a lot of, rows in background. And if you prepare them in the background, it's reasonable to forgo latency to do the enitre job faster.

Turns out that for the same query I was running above, except with ORDER By notes.sfld and using browserRowForId:

This is so amazingly fast that it should give no frame drops even on old and slow devices.

github-actions[bot] commented 2 years ago

Hello 👋, this issue has been opened for more than 2 months with no activity on it. If the issue is still here, please keep in mind that we need community support and help to fix it! Just comment something like still searching for solutions and if you found one, please open a pull request! You have 7 days until this gets closed automatically

mikehardy commented 10 months ago

Hey there 👋 - this issue came up as something that is pending for the 2.17 release as it should fix #14353 in passing. It looks like there is functional code laying around somewhere for a couple of different approaches, but I don't see an actual PR open for this - is there a PR for this?

Note that with CardBrowser refactoring going on (👀 @david-allison ) it may be conflict-ridding and/or it may be in progress as part of the refactor already (but just not linked?)

So with the context out of the way, to get to the point: is there a current status + work effort in progress here? Just curious

Cheers :-)

david-allison commented 10 months ago

To note: I've got a patch here, to be applied after the refactorings are completed, but there's quite a lot of work:

This moves from CardCache to CardOrNoteId as the row primitive

That means that all operations in the card browser need to be able to transform the note Id to the card Id, or natively handle a NoteId

It also changes the "column" behaviour, as the set of Anki desktop columns don't match the current set of columns we provide

mikehardy commented 10 months ago

Okay, my priority in area of "things I control" then is to be a rapid reviewer on the CardBrowser refactors. Off I go to do that