FooSoft / yomichan

Japanese pop-up dictionary extension for Chrome and Firefox.
https://foosoft.net/projects/yomichan
Other
1.06k stars 213 forks source link

[Bug] _onApiGetAnkiNoteInfo is slow, causing the "View added note" (duplicate note check) in the yomichan popup to be slow #2184

Open kuroahna opened 2 years ago

kuroahna commented 2 years ago

Perhaps this should be raised in anki-connect instead of Yomichan, but if I understand it properly, Yomichan calls anki-connect's APIs to do the duplicate card checking And we do database queries such as

  1. select id from notes where csum=? and mid=?
  2. select id from notes where csum=? and id!=?
  3. select did from cards where nid=?
  4. select id, field_at_index(flds, 0) from notes where csum=? and mid=? if duplicateScope != 'deck' and not duplicateScopeCheckAllModels which calls is_duplicate() and does the database query

It would be nice if we cached all the cards in the anki collection that we know exists and avoid having to query the database to see if it's a duplicate card or not.

Would it be as simple as going through all the notes in the anki collection on anki-connect startup, caching the note's first field checksum into a Set, and then when we want to check if a new note is a duplicate or not, we get the checksum of the new note's first field and check if it is in the Set? This will probably slow anki on startup since it has to build the cache, but perhaps this could be an optional option/flag? Also not sure how much memory this will take, but my computer has 16 GB of RAM which should be plenty, and I feel like most computers nowadays ship with 8 GB at least

I heavily use the duplicate card check feature, and when hovering some words/phrases like 今日 or 大体の, it takes about ~3 seconds to show that it is a duplicate in my anki collection. But for other words like 食べる, it takes about ~1 second. I have tried this on both firefox and chrome, and they're both the same speed.

I have ~17.5k cards created so far, and I have 34 dictionaries installed, so Yomichan feels somewhat sluggish.

I can share my anki collection if needed.

EDIT: Yomichan version: 22.4.4.0 Chrome: 102.0.5005.115 (Official Build) (64-bit) Firefox Developer Edition: 102.0b9 (64-bit)

kuroahna commented 2 years ago

I did some extra testing, and I found something very strange. I edited anki-connect's __init__.py making

    def isNoteDuplicateOrEmptyInScope(
        self,
        note,
        deck,
        collection,
        duplicateScope,
        duplicateScopeDeckName,
        duplicateScopeCheckChildren,
        duplicateScopeCheckAllModels
    ):
        return 2
        # Returns: 1 if first is empty, 2 if first is a duplicate, 0 otherwise.

        # note.dupeOrEmpty returns if a note is a global duplicate with the specific model.
        # This is used as the default check, and the rest of this function is manually
        # checking if the note is a duplicate with additional options.
        if duplicateScope != 'deck' and not duplicateScopeCheckAllModels:
            return note.dupeOrEmpty() or 0

where the function immediately returns that every card is a duplicate (no database queries) and it's even slower. Scanning a word like すき焼き (has 33 entries with my 32 enabled dictionaries) taking about ~8 seconds to return even though there shouldn't be any database queries. And strangely, it even freezes my anki during these 8 seconds. I try to move the anki window around but it freezes until it resolves the addNote request I assume? Scanning a word like 昨日 instead has 8 entries in the yomichan popup and is much faster but still hangs for ~3 seconds despite not having to do anything.

Making the function return 1 has the same effect as return 2. On the other hand, making the function return 0 makes the duplicate check fast, although takes about ~1 seconds to do the duplicate check for すき焼き and ~0.25 seconds to do the duplicate check for 昨日

I can only assume that this lag I'm experiencing is because we're throwing exceptions when calling createNote ?

        duplicateOrEmpty = self.isNoteDuplicateOrEmptyInScope(
            ankiNote,
            deck,
            collection,
            duplicateScope,
            duplicateScopeDeckName,
            duplicateScopeCheckChildren,
            duplicateScopeCheckAllModels
        )

        if duplicateOrEmpty == 1:
            raise Exception('cannot create note because it is empty')
        elif duplicateOrEmpty == 2:
            if allowDuplicate:
                return ankiNote
            raise Exception('cannot create note because it is a duplicate')
        elif duplicateOrEmpty == 0:
            return ankiNote
        else:
            raise Exception('cannot create note for unknown reason')

Anki hanging and freezing for ~8 seconds when hovering over a word for checking duplicates is really concerning though, is anki-connect blocking and running on the main thread? Or is it running on a separate thread?

kuroahna commented 2 years ago

I've done some more testing, I think the lag isn't coming from the createNote or isNoteDuplicateOrEmptyInScope functions. The lag is coming from calling findNotes

In anki-connect, I set

    def isNoteDuplicateOrEmptyInScope(
        self,
        note,
        deck,
        collection,
        duplicateScope,
        duplicateScopeDeckName,
        duplicateScopeCheckChildren,
        duplicateScopeCheckAllModels
    ):
        return 2
        # Returns: 1 if first is empty, 2 if first is a duplicate, 0 otherwise.
...

so that everything is a duplicate, and I also added some logs in the createNote function:

    def createNote(self, note):
        collection = self.collection()
...

        print("\n-------------")
        print("before duplicateOrEmpty")
        duplicateOrEmpty = self.isNoteDuplicateOrEmptyInScope(
            ankiNote,
            deck,
            collection,
            duplicateScope,
            duplicateScopeDeckName,
            duplicateScopeCheckChildren,
            duplicateScopeCheckAllModels
        )
        print("after duplicateOrEmpty")
...

And also in the findNotes API:

    @util.api()
    def findNotes(self, query=None):
        if query is None:
            return []
        print("Calling findNotes with query: ", query)
        return list(map(int, self.collection().findNotes(query)))

I ran anki-console.bat in Powershell, then hover on a word like 携える which has 3 yomichan entries (my J-J dictionaries show 携える[たずさえる], 携[けい], and 携 (no readings))


-------------
before duplicateOrEmpty
after duplicateOrEmpty
addons21\2055492159\__init__.py:210:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:214:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:219:model is deprecated: please use 'note_type'

-------------
before duplicateOrEmpty
after duplicateOrEmpty
addons21\2055492159\__init__.py:210:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:214:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:219:model is deprecated: please use 'note_type'

-------------
before duplicateOrEmpty
after duplicateOrEmpty
addons21\2055492159\__init__.py:210:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:214:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:219:model is deprecated: please use 'note_type'

-------------
before duplicateOrEmpty
after duplicateOrEmpty
addons21\2055492159\__init__.py:210:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:214:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:219:model is deprecated: please use 'note_type'

-------------
before duplicateOrEmpty
after duplicateOrEmpty
addons21\2055492159\__init__.py:210:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:214:byName is deprecated: please use 'by_name'
addons21\2055492159\__init__.py:219:model is deprecated: please use 'note_type'

-------------
before duplicateOrEmpty
after duplicateOrEmpty
Calling findNotes with query:  "word:謳コ縺医k"
addons21\2055492159\__init__.py:1137:findNotes is deprecated: please use 'find_notes'
Calling findNotes with query:  "word:縺溘★縺輔∴繧・
addons21\2055492159\__init__.py:1137:findNotes is deprecated: please use 'find_notes'
Calling findNotes with query:  "word:謳コ"
addons21\2055492159\__init__.py:1137:findNotes is deprecated: please use 'find_notes'
Calling findNotes with query:  "word:縺代>"
addons21\2055492159\__init__.py:1137:findNotes is deprecated: please use 'find_notes'
Calling findNotes with query:  "word:謳コ"
addons21\2055492159\__init__.py:1137:findNotes is deprecated: please use 'find_notes'
Calling findNotes with query:  "word:謳コ"
addons21\2055492159\__init__.py:1137:findNotes is deprecated: please use 'find_notes'

The duplicateOrEmpty returns instantly because we made it return 2, but even with the original database queries, it's fast. The slowness is coming from calling findNotes.

I'm not sure what these queries actually are, it seems like Powershell didn't properly render it, but it calls createNote and findNotes 6 times. I assume since there's 3 yomichan entries in the popup when scanning 携える, it calls createNote and findNotes for the View Added Note and Add Expression button, where createNote responds quickly, but findNotes is very slow, and does it sequentially, waiting for each one to finish

kuroahna commented 2 years ago

I wrote a small script testing findNotes using anki-connect

while true
do
    for i in $(seq 1 10)
    do
        curl localhost:8765/findNotes -X "POST" -d '{"action":"findNotes","version":6,"params":{"query":"deck:current"}}'
    done
    sleep 1
done

This sends 10 deck:current search queries to anki (I have 17.5k cards) every second. anki-connect responds almost instantly and has a very slight freeze, but doesn't freeze like it does when you scan a word using Yomichan

I found that we call _onApiGetAnkiNoteInfo() where we call const noteIdsArray = await this._anki.findNoteIds(cannotAddNotes); in the backend

But for some reason this lags, and it's only doing word:<THE_JAPANESE_WORD> query instead (a handful of cards instead of 17.5k) Could this lag just be due to the browser's async-await (promise)?

toasted-nutbread commented 2 years ago

Since I suspect this might have something to do with the large amount of cards you have, sharing an export of your Anki collection might be useful. Sharing your Yomichan settings file would be useful also, then I can try to reproduce.

Related issue from a while ago: #993

kuroahna commented 2 years ago
  1. ankiCollectionAndYomichanSettings.zip
  2. Dictionaries installed (uploaded to mega as this is 732 MB)

Anki Version ⁨2.1.53 (96bacf79)⁩ Python 3.9.7 Qt 6.3.0 PyQt 6.3.0 I am using the Windows 10 qt6 build Anki connect version 2022-05-16

Yomichan version: 22.4.4.0 Chrome: 102.0.5005.115 (Official Build) (64-bit) Firefox Developer Edition: 102.0b9 (64-bit)

Oh, I didn't know about that issue, this is definitely related

toasted-nutbread commented 2 years ago

Testing with your setup, I do see a bit of delay, but that all seems to be coming from the AnkiConnect side. Nothing looks out of the ordinary from _onApiGetAnkiNoteInfo in Yomichan.

findNoteIds duration measured from Anki: 1334.3s findNoteIds duration measured from Yomichan: 1380.1ms

kuroahna commented 2 years ago

On Chrome, I edited backend.js to get the time:

    async _onApiGetAnkiNoteInfo({notes, fetchAdditionalInfo}) {
        const t1 = performance.now();
        const results = [];
        const cannotAdd = [];
        const canAddArray = await this._anki.canAddNotes(notes);

        for (let i = 0; i < notes.length; ++i) {
            const note = notes[i];
            let canAdd = canAddArray[i];
            const valid = AnkiUtil.isNoteDataValid(note);
            if (!valid) { canAdd = false; }
            const info = {canAdd, valid, noteIds: null};
            results.push(info);
            if (!canAdd && valid) {
                cannotAdd.push({note, info});
            }
        }

        if (cannotAdd.length > 0) {
            const cannotAddNotes = cannotAdd.map(({note}) => note);
            const noteIdsArray = await this._anki.findNoteIds(cannotAddNotes);
            for (let i = 0, ii = Math.min(cannotAdd.length, noteIdsArray.length); i < ii; ++i) {
                const noteIds = noteIdsArray[i];
                if (noteIds.length > 0) {
                    cannotAdd[i].info.noteIds = noteIds;
                    if (fetchAdditionalInfo) {
                        cannotAdd[i].info.noteInfos = await this._anki.notesInfo(noteIds);
                    }
                }
            }
        }

        const t2 = performance.now();
        console.log(`getAnkiNoteInfo: ${t2 - t1}`)

        return results;
    }

Scanning the following words, I get:

  1. 今日 getAnkiNoteInfo: 1736.5
  2. 大体の getAnkiNoteInfo: 1645.8999999985099
  3. すき焼き getAnkiNoteInfo: 741
  4. 食べる getAnkiNoteInfo: 176.89999999850988
  5. 携える getAnkiNoteInfo: 161.89999999850988

Now, if in anki connect in __init__.py I edit the isNoteDuplicateOrEmptyInScope function so that everything is reported as a duplicate (I originally opened this issue to suggest that we avoid doing database query checks in anki-connect and keep an in-memory cache of all the known duplicate cards because I thought that was what was slowing down the duplicate checks in yomichan, but that doesn't seem to be the case):

    def isNoteDuplicateOrEmptyInScope(
        self,
        note,
        deck,
        collection,
        duplicateScope,
        duplicateScopeDeckName,
        duplicateScopeCheckChildren,
        duplicateScopeCheckAllModels
    ):
        # Returns: 1 if first is empty, 2 if first is a duplicate, 0 otherwise.
        return 2
...

Scanning the same words above, I get:

  1. 今日 getAnkiNoteInfo: 2939.2999999970198
  2. 大体の getAnkiNoteInfo: 2916.1999999955297
  3. すき焼き getAnkiNoteInfo: 7157.60000000149
  4. 食べる getAnkiNoteInfo: 2930.7999999970198
  5. 携える getAnkiNoteInfo: 728

すき焼き takes about 7.1 seconds to finish.

Adding some more logs in anki connect, if we edit findNotes:

    @util.api()
    def findNotes(self, query=None):
        if query is None:
            return []

        print("findNotes called with query: ", query)
        return list(map(int, self.collection().findNotes(query)))

And open up anki-console.bat to see the logs and scan すき焼き again:

findNotes called with query:  "word:縺吶″辟シ縺・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″繧・″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″辟シ縺・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:繧ケ繧ュ繝、繧ュ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″辟シ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″繧・″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″辟シ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:繧ケ繧ュ繝、繧ュ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:螂ス縺・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:髫・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:驪、"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:謨ー螂・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:騾上″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:謨ー蟇・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:迥・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:謚・″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:貍峨″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:荳サ蝓コ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:貍・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:譴ウ縺・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:谺。"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:謨ク螂・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:謨ク蟇・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:謨ク遶・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:サカ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:迥・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:鬆亥イ・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:蝮・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:譚・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:郢ヲ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:隘キ"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:隘・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:迥ゅ″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:遨コ縺・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:驟ク縺・
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'
findNotes called with query:  "word:縺吶″"
addons21\2055492159\__init__.py:1083:findNotes is deprecated: please use 'find_notes'

The findNotes run sequentially and are a bit slow to resolve, taking 7.1 seconds to finish

toasted-nutbread commented 2 years ago

Modify your AnkiConnect __init__.py and edit its config file to output the log to a file somewhere on your computer:

diff --git a/plugin/__init__.py b/plugin/__init__.py
index e51bbeb..f89faa8 100644
--- a/plugin/__init__.py
+++ b/plugin/__init__.py
@@ -83,7 +83,8 @@ class AnkiConnect:

     def logEvent(self, name, data):
         if self.log is not None:
-            self.log.write('[{}]\n'.format(name))
+            t = time.perf_counter()
+            self.log.write('[{0}] @ {1:0.4f}\n'.format(name, t))
             json.dump(data, self.log, indent=4, sort_keys=True)
             self.log.write('\n\n')
             self.log.flush()

Then, check the log file to compare how long it takes from [request] to [response]. Your 7.1seconds for 64 findNotes requests isn't too outlandish, as it's only 7.1/64=~0.1109375 seconds per lookup. This is a bit slower than what I'm seeing though, so I'm not sure what accounts for that. What are the specs of the device you are using?

kuroahna commented 2 years ago

Editing __init__.py with:

    def isNoteDuplicateOrEmptyInScope(
        self,
        note,
        deck,
        collection,
        duplicateScope,
        duplicateScopeDeckName,
        duplicateScopeCheckChildren,
        duplicateScopeCheckAllModels
    ):
        # Returns: 1 if first is empty, 2 if first is a duplicate, 0 otherwise.
        return 2
...

and

diff --git a/plugin/__init__.py b/plugin/__init__.py
index e51bbeb..f89faa8 100644
--- a/plugin/__init__.py
+++ b/plugin/__init__.py
@@ -83,7 +83,8 @@ class AnkiConnect:

     def logEvent(self, name, data):
         if self.log is not None:
-            self.log.write('[{}]\n'.format(name))
+            t = time.perf_counter()
+            self.log.write('[{0}] @ {1:0.4f}\n'.format(name, t))
             json.dump(data, self.log, indent=4, sort_keys=True)
             self.log.write('\n\n')
             self.log.flush()

and editing config.json to write the logs to my desktop, and scanning すき焼き, here are the logs: logs.txt

Yeah, it seems to be about ~0.11 seconds per lookup

The specs of my PC is pretty old now, but it should be plenty fast:

According to dxdiag: CPU: Intel(R) Core(TM) i7-2700K CPU @ 3.50 GHz (8 CPUs), ~3.5GHz (note I overclocked my CPU to 4.8GHz but it doesn't show up in dxdiag, it shows up in task manager) RAM: 16384 MB of RAM GPU: NVIDIA GeForce RTX 2060

I wrote a bash script that queries deck:current 66 times:

#!/bin/bash

for i in $(seq 1 66)
do
    curl localhost:8765/findNotes -X "POST" -d '{"action":"findNotes","version":6,"params":{"query":"deck:current"}}'
done

and ran time ./test.sh (I'm using WSL)

./test.sh  0.16s user 0.55s system 16% cpu 4.233 total

And looking at the logs as well: logsDeckCurrent.zip

8.5991 - 4.4325 = 4.1666 seconds for it to finish querying deck:current 66 times.

Would it be possible to somehow make findNotes happen in parallel? I assume that each call is running sequentially on a single thread, so it needs to wait for each one. Would we get any performance boosts if we could use multiple threads?

EDIT: Or, would it somehow be possible to make it so that instead of doing 66 findNotes calls, it just does 1 query to anki-connect instead? Ie, would it be possible to lower the amount of searches it needs to do (like when doing a database query, you select all the stuff you need in 1 query, instead of having 66 queries to get each information)

EDIT 2: I made another script which queries deck:current 66 times in separate processes instead of waiting sequentially:

#!/bin/bash

for i in $(seq 1 66)
do
    curl localhost:8765/findNotes -X "POST" -d '{"action":"findNotes","version":6,"params":{"query":"deck:current"}}' &
done

wait

And running it:

time ./test.sh
./test.sh  0.16s user 1.16s system 343% cpu 0.383 total

This takes about 0.383 seconds, that's almost 11 times faster

peldas commented 2 years ago

I have a suggestion that could speed this up too.

How about only checking the first 2-3 terms which would load nearly instantly, and then only checking anything else if the user scrolls down in the Yomichan window? You could put a loading GIF on each entry in place of the "View added note" button while waiting for a response from Anki.

I don't see the need to check every single term every time we open the window when most of the time, most users are not going to even look at them.

kuroahna commented 2 years ago

Yeah, I think that would make Yomichan feel a bit more snappier. Instead of const noteIdsArray = await this._anki.findNoteIds(cannotAddNotes); calling anki-connect's multi API and waiting for the response to finish calling findNotes 64 times on a single thread before rendering the View added note button for all 32 entries at once, we would call findNotes 64 times directly, loading the results as they get loaded (ie the first result would load fast since it's the first result, then it loads the 2nd, etc), which would probably need to change the display-anki code as well

Would this be a potential solution?

Also, unfortunately it seems impossible for anki-connect to be multi threaded due to Python's Global Interpreter Lock, and Anki/sqlite not being thread safe (although, I feel like findNotes and canAddNotes could be thread friendly since it's just reads to the database, and not writes). It could be possible to make a custom fork of Anki using the Rust backend, implementing a multi-threaded web server in Rust for AnkiConnect APIs, but I personally have 0 Rust experience, and don't know much about Anki's codebase.

toasted-nutbread commented 2 years ago

Would this be a potential solution?

It's certainly a potential solution, but it does add complexity and introduce other latency. The Anki connection is run on the background page, so requests have to be proxied through that. Additionally, every AnkiConnect HTTP request incurs additional delay. So overall, it will likely take longer to query every note status. This may be offset by the "perceived" speed improvement of the first few entries appearing more quickly, so the added complexity is probably a larger issue.

kuroahna commented 2 years ago

Yeah the added complexity doesn't sound too good.

I do notice however that anki-connect has the option 'apiPollInterval': 25,. If I change it to 'apiPollInterval': 1,, I can shave off 24 miliseconds for each request to anki-connect, which is a pretty decent speed boost.

I was playing around in anki-connect and trying to build an internal cache of all the findNotes queries to noteIds upfront when anki-connect starts up so that we don't have to query Anki again (will also need to patch the add/remove notes methods in anki so that it adds it to the cache), but doing 17.7k anki searches on my collection on a single thread is very slow 😅 so this is probably not a feasible/realistic solution

I was also trying to use python's multiprocessing module to get around the Global Interpreter Lock issue, but it seems like it's not possible either since it tries to open multiple instances of Anki which doesn't work since you can only have 1 instance of Anki open.

It sounds like the solution here is for anki-connect to be multi-threaded, but that's impossible due to Python's GIL limitation. Not sure what alternatives there are other than to make a fork of Anki, having the multithreaded web server built directly inside Anki via Rust.

Also, I was testing out your PR, this is pretty nice, retesting some of the searches like すき焼き and modifying anki connect's isNoteDuplicateOrEmptyInScope() function to always return 2 so that everything is a "duplicate", it went from 64 queries to 32 (2x faster!).

kuroahna commented 2 years ago

Hi @toasted-nutbread

I realized I didn't have to call findNotes to find the noteIds for ~17.7k search queries when building my internal cache since I already have them 🤦 . I made a fork of anki-connect which caches the checksums and queries to noteIds for the duplicate note checking, and properly updates the cache when adding/removing a note. The requests to anki-connect are very fast now, and _onApiGetAnkiNoteInfo now resolves within 20-100 ms, depending on how many requests it needs to do, compared to 700 ms or even 3000 ms before. From my quick testing, building the cache for my entire collection (~17.7k cards) uses about ~30 MB of RAM, and only takes ~1 second to build the entire cache

I only made it so that it works for the Yomichan settings

  1. Check for duplicates across all models: false
  2. Duplicate card scope: Collection

since those are my settings I use for Yomichan, but it could be improved to support other settings as well.

I'm not sure if we want to close this issue now? I don't know if we really want to push my fork to the main anki-connect repo as it may not be desirable for other users to have this internal cache and take extra memory to use the addon, and my code is pretty ugly (something I hacked together quickly), and this also means anki-connect's cache depends on Yomichan implementation for the queries which won't work for users who aren't using Yomichan, but I'm satisfied with my own solution to the problem.