johnfactotum / foliate

Read e-books in style
https://johnfactotum.github.io/foliate/
GNU General Public License v3.0
6.45k stars 298 forks

Anki extension/plugin from definition #758

Open 1over137 opened 3 years ago

1over137 commented 3 years ago

I'd like to be able to create Anki cards while reading. Such cards should contain the sentence, an unknown word, and the definition of the unknown word. This program, which supports both text-to-speech and Wiktionary, seems ideal to me.

I would like an interface to:

The first one is the most needed, while the other two already seem possible with the current version using some wrapper scripts.

I might be able to implement these features myself, but I am not familiar with epub.js. Can someone familiar with the project tell me how I should go about doing this?

johnfactotum commented 3 years ago

access the current sentence surrounding the selected word

I think WebKit (which Foliate uses) and some others support the non-standard Range.prototype.expand() or Selection.prototype.modify(). So one can do

const s = window.getSelection()
const r = s.getRangeAt(0)
r.expand('sentence')
// `r` now contains the whole sentence

It's simple but non-standard (so in theory it could stop working at any moment). The alternative, I think, is to loop through the offsets and adjacent nodes of the Range until you reach a sentence boundary, which is a bit more involved.
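For illustration, here is a minimal sketch of that manual alternative, restricted to a single string rather than a real DOM Range. The function name `expandToSentence` and the terminator set are assumptions made for this example, not anything from Foliate or epub.js, and real sentence rules are more subtle (abbreviations, quotes, locale).

```javascript
// Characters treated as sentence terminators. This set is an assumption;
// real sentence-boundary rules depend on the locale of the text.
const TERMINATORS = new Set(['.', '!', '?', '。', '！', '？'])

// Given the text of a node and the [start, end) offsets of a selection,
// return the offsets and text of the surrounding sentence.
function expandToSentence(text, start, end = start) {
    // Walk backward from the selection to just after the previous terminator.
    let from = 0
    for (let i = start - 1; i >= 0; i--) {
        if (TERMINATORS.has(text[i])) {
            from = i + 1
            break
        }
    }
    // Walk forward from the last selected character to the next terminator.
    let to = text.length
    for (let i = Math.max(start, end - 1); i < text.length; i++) {
        if (TERMINATORS.has(text[i])) {
            to = i + 1
            break
        }
    }
    // Skip the whitespace that separates the previous sentence from this one.
    while (from < to && /\s/.test(text[from])) from++
    return { start: from, end: to, sentence: text.slice(from, to) }
}
```

A real implementation would also have to continue into adjacent text nodes when no boundary is found inside the current one, which is the part that makes this approach more involved.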

The first one is the most needed, while the other two already seem possible with the current version using some wrapper scripts.

There's currently no way of doing that in Foliate as it doesn't support plugins yet (#359).

1over137 commented 3 years ago

There's currently no way of doing that in Foliate as it doesn't support plugins yet (#359).

I meant that I could simply use the custom TTS program feature to achieve the send-to-Anki feature, but that way I would not be able to use the current Wiktionary page feature.

johnfactotum commented 3 years ago

Ah, yes, abusing the TTS feature. I thought about that, but not being able to get the Wiktionary page is less of a problem, as you can fetch Wiktionary's data in the script. You won't get the "current" page that way, which might be a problem if you'd like to get the non-inflected page, but at any rate you can try to resolve that outside of Foliate.
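As a rough sketch of fetching Wiktionary's data in a script: the snippet below uses the Wiktionary REST definition endpoint. The helper names (`definitionUrl`, `flattenDefinitions`, `lookup`) are hypothetical, and the response shape (an object keyed by language code, each entry carrying `partOfSpeech` and a `definitions` array of HTML strings) is an assumption that should be checked against the API documentation.

```javascript
// Base URL of the Wiktionary REST definition endpoint.
const REST_BASE = 'https://en.wiktionary.org/api/rest_v1/page/definition/'

// Build the request URL for a word, escaping it for use in a path.
function definitionUrl(word) {
    return REST_BASE + encodeURIComponent(word.trim())
}

// Flatten the (assumed) response shape into a simple list of definitions
// for one language code. Each definition is an HTML string.
function flattenDefinitions(data, langCode = 'en') {
    const entries = data[langCode] ?? []
    return entries.flatMap(entry =>
        (entry.definitions ?? []).map(d => ({
            partOfSpeech: entry.partOfSpeech,
            html: d.definition,
        })))
}

// Fetch and flatten in one step. Requires a runtime with a global fetch().
async function lookup(word, langCode = 'en') {
    const res = await fetch(definitionUrl(word))
    if (!res.ok) throw new Error(`Wiktionary lookup failed: ${res.status}`)
    return flattenDefinitions(await res.json(), langCode)
}
```

Calling `lookup('inflected')` would return the entries for the inflected form itself, so resolving the base form is a separate step.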

The problem is how to attach the sentence info along with the word. To make the whole thing work would require making a different interface and would probably require some changes in Foliate that I probably won't have time to do for some while to come.

Also, if I'm not mistaken, to properly select the whole sentence, one might need something like the Intl.Segmenter API, as the rules depend on the locale of the text. Not sure what rules are used in the Range.expand() API, or whether it will be standardized or removed in the future.
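For reference, a minimal sketch of locale-aware sentence lookup with Intl.Segmenter. The function name `sentenceAt` is made up for this example, and it assumes a runtime that actually ships Intl.Segmenter; whether the WebKit version used by Foliate does is not something this sketch can confirm.

```javascript
// Return the sentence that contains the given character offset,
// using the locale's segmentation rules, or null if the offset is out of range.
function sentenceAt(text, offset, locale = 'en') {
    const segmenter = new Intl.Segmenter(locale, { granularity: 'sentence' })
    // containing() returns undefined when the offset is outside the text.
    const hit = segmenter.segment(text).containing(offset)
    if (!hit) return null
    // Segments keep their trailing whitespace, so trim it for display.
    return { start: hit.index, sentence: hit.segment.trim() }
}
```

The offset would come from the selection (for example, the start offset of the selected word within the paragraph's text), and the locale from the book's language metadata.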

So I think there's a much easier workaround to all that:

  1. You select the whole sentence in Foliate, not just the word or phrase.
  2. After sending the sentence to your script, you would then select the word or phrase in your script's GUI.
  3. Pop up your own Wiktionary page in your script, and you would then select the desired definition from there.

1over137 commented 3 years ago

Hm, seems like it might be less work to simply steal or recreate your Wiktionary parsing script, and possibly the popup as well, to create a desktop widget independent of reader apps. Do you happen to know of a way to get a link to the uninflected form via Wiktionary? Or is it better for me to do that step with a lemmatizer before sending a request to Wiktionary at all?

johnfactotum commented 3 years ago

From the definition HTML, you can extract the link inside the <span> with the .form-of-definition-link class.

For example, the definition of "inflected" includes the following:

<span class="form-of-definition use-with-mention" about="#mwt26" typeof="mw:Transclusion">
  simple past tense and past participle of
  <span class="form-of-definition-link">
    <i class="Latn mention" lang="en">
      <a rel="mw:WikiLink" href="/wiki/inflect#English" title="inflect">inflect</a>
    </i>
  </span>
</span>

See https://en.wiktionary.org/wiki/Module:form_of.
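A minimal sketch of pulling that link out of a definition's HTML string. The function name `extractLemmaLink` is hypothetical, and the regular expressions assume markup shaped like the example above; in a browser or WebKit context, parsing with DOMParser and `querySelector('.form-of-definition-link a')` would be more robust.

```javascript
// Find the link to the base (uninflected) form inside a definition's HTML.
// Returns null when the definition is not a "form of" definition.
function extractLemmaLink(definitionHtml) {
    // Locate the span carrying the form-of-definition-link class.
    const span = /<span[^>]*class="[^"]*\bform-of-definition-link\b[^"]*"[^>]*>([\s\S]*?)<\/span>/
        .exec(definitionHtml)
    if (!span) return null
    // Take the first anchor inside that span.
    const anchor = /<a\b[^>]*\bhref="([^"]*)"[^>]*>([\s\S]*?)<\/a>/.exec(span[1])
    if (!anchor) return null
    // The href looks like "/wiki/inflect#English": page name, then language section.
    const [path, fragment = ''] = anchor[1].split('#')
    return {
        href: anchor[1],
        lemma: decodeURIComponent(path.replace(/^.*\/wiki\//, '')),
        language: fragment,
        text: anchor[2],
    }
}
```

The returned `lemma` can then be looked up again to get the definition of the base form.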

johnfactotum commented 3 years ago

By the way, some while ago I made a standalone Wiktionary app called Quick Lookup. The app is just a single GJS script file. With some work, one can probably adapt it as an app for making Anki cards.

It supports looking up words from the primary selection (i.e., the "middle click paste" clipboard) if you run it with the --selection option. The advantage is that you can feed it text from any app, including browsers, PDF readers, feed readers, etc., not just Foliate!

1over137 commented 3 years ago

I decided to spin up this project instead for more freedom and more universal application. https://github.com/FreeLanguageTools/ssmtool However, I would still like to have a way to easily select a sentence directly from Foliate and copy it to the clipboard (not PRIMARY, since Qt doesn't support it). Can I try to implement an option that performs sentence selection on right-click (which appears to be unused in your program now)?

1over137 commented 3 years ago

@johnfactotum Lately I've created browser extensions for ssmtool, which allow one-click card creation. https://addons.mozilla.org/en-GB/firefox/addon/click-copy-sentence/ In short, what it does is: