FreeLanguageTools / vocabsieve

Simple sentence mining tool for language learning
GNU General Public License v3.0

Hindi lookup on Wiktionary not working #171

Status: Closed (madisonruppert closed 2 months ago)

madisonruppert commented 3 months ago

Describe the bug
Wiktionary lookup isn't working for any Hindi word I've tried, even though the words are in Wiktionary and can be searched for there. The definition just shows as blank, while the Google Translate lookup works fine. I also tried to build my own local dictionary from Kaikki, but the data is in JSONL format rather than JSON, so VocabSieve won't import it.

However, clicking "open webpage" successfully opens the definition page for the selected word. It seems that VocabSieve is unable to read the Hindi Wiktionary entries?
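On the JSONL point above: JSONL is one JSON object per line, so it can be collapsed into a single JSON file with a few lines of Python. This is only a rough sketch. It assumes each Kaikki record carries a `word` field and a `senses` list whose items may contain `glosses`, and it assumes that a flat headword-to-definitions JSON object is a format VocabSieve will import, which should be checked against the VocabSieve documentation. The function names and file names are hypothetical.

```python
import json


def jsonl_to_dict(lines):
    """Collapse JSONL records into {headword: [gloss, ...]}.

    Assumption: each record looks like
    {"word": ..., "senses": [{"glosses": [...]}, ...]}.
    """
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        word = entry.get("word")
        if not word:
            continue  # skip records without a headword
        for sense in entry.get("senses", []):
            for gloss in sense.get("glosses", []):
                result.setdefault(word, []).append(gloss)
    return result


def convert_file(src_path, dst_path):
    """Read a JSONL dump and write a single JSON object."""
    with open(src_path, encoding="utf-8") as src:
        data = jsonl_to_dict(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        # ensure_ascii=False keeps Devanagari readable in the output file
        json.dump(data, dst, ensure_ascii=False, indent=2)


# Hypothetical file names:
# convert_file("kaikki-hindi.jsonl", "hindi.json")
```

Words with several entries (for example a noun and a verb) are merged under one headword here; splitting them by part of speech would need a different target layout.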

To Reproduce
Steps to reproduce the behavior:

  1. Set target language as Hindi
  2. Copy any Hindi word
  3. The Wiktionary definition is blank
  4. Click "open website": the correct Wiktionary page opens, which shows that the Hindi word is read correctly and the link works, but the entry is not being parsed.

Expected behavior
The Wiktionary definition is displayed properly.

Logs
2024-08-27 15:57:22.954 | DEBUG | vocabsieve.main:pollClipboard:137 - Polling: Clipboard text changed to '''समय'''
2024-08-27 15:57:30.871 | DEBUG | vocabsieve.ui.searchable_boldable_text_edit:bold:11 - bolding समय
2024-08-27 15:57:30.876 | DEBUG | vocabsieve.ui.multi_definition_widget:lookup:135 - Looking up समय in [<vocabsieve.sources.wiktionary_source.WiktionarySource object at 0x13c4dd410>]
2024-08-27 15:57:30.877 | INFO | vocabsieve.sources.wiktionary_source:_lookup:20 - Looking up समय in Wiktionary
2024-08-27 15:57:30.877 | DEBUG | vocabsieve.ui.multi_definition_widget:lookup:135 - Looking up समय in [<vocabsieve.sources.google_translate_source.GoogleTranslateSource object at 0x13c4dca10>]
2024-08-27 15:57:30.879 | INFO | vocabsieve.sources.forvo_audio_source:_lookup:173 - Forvo lookup समय
2024-08-27 15:57:33.214 | DEBUG | vocabsieve.ui.multi_definition_widget:run:70 - LookupWorker: looked up समय in Google Translate in 2.34 seconds
2024-08-27 15:57:33.215 | DEBUG | vocabsieve.ui.multi_definition_widget:appendDefinition:164 - All sources have been looked up
2024-08-27 15:57:33.551 | ERROR | vocabsieve.sources.wiktionary_source:_lookup:26 - Failed to get data from Wiktionary: HTTPError('404 Client Error: Not Found for url: https://kaikki.org/dictionary/Hindi/meaning/%E0%A4%B8/%E0%A4%B8%E0%A4%AE/%E0%A4%B8%E0%A4%AE%E0%A4%AF.json')
2024-08-27 15:57:33.556 | INFO | vocabsieve.sources.wiktionary_source:_lookup:20 - Looking up समय in Wiktionary
2024-08-27 15:57:34.054 | ERROR | vocabsieve.sources.wiktionary_source:_lookup:26 - Failed to get data from Wiktionary: HTTPError('404 Client Error: Not Found for url: https://kaikki.org/dictionary/Hindi/meaning/%E0%A4%B8/%E0%A4%B8%E0%A4%AE/%E0%A4%B8%E0%A4%AE%E0%A4%AF.json')
2024-08-27 15:57:34.058 | DEBUG | vocabsieve.ui.multi_definition_widget:run:70 - LookupWorker: looked up समय in Wiktionary (English) in 3.18 seconds
2024-08-27 15:57:34.058 | DEBUG | vocabsieve.ui.multi_definition_widget:appendDefinition:164 - All sources have been looked up
2024-08-27 15:58:08.433 | DEBUG | vocabsieve.main:getKnownDataOnThread:426 - Some data sources aren't available, not getting known data now
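The failing URL in the log appears to follow the pattern `meaning/<first character>/<first two characters>/<word>.json`, with each part percent-encoded as UTF-8. The sketch below rebuilds that URL so it can be checked by hand in a browser. The pattern is inferred from the single URL in the log, not taken from VocabSieve's source or from Kaikki documentation, and `kaikki_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def kaikki_url(word, language="Hindi"):
    """Rebuild the lookup URL seen in the log.

    Assumption: the path is <first char>/<first two chars>/<word>.json,
    inferred from one logged URL only.
    """
    first = quote(word[:1])      # first code point, percent-encoded
    first_two = quote(word[:2])  # first two code points, percent-encoded
    return (
        f"https://kaikki.org/dictionary/{quote(language)}/meaning/"
        f"{first}/{first_two}/{quote(word)}.json"
    )


print(kaikki_url("समय"))
```

For समय this yields the same URL that returned the 404 in the log, so the request path can be tested independently of VocabSieve.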

1over137 commented 2 months ago

Should be fixed in 0.12.1