Bulk Download Feature - Not Working After New Update

AkalankaRan commented 4 weeks ago

This tool has been fantastic for learning Finnish. However, after the latest update, the bulk audio download feature for a deck no longer works as it did before. Previously, this function operated smoothly, but now it's only possible to download audio for single cards using Forvo. Many users and I would greatly appreciate it if you could restore the previous bulk download feature for decks.

Rascalov commented 3 weeks ago

Hey,

Thanks for the report. I expect to have time for this on Friday.

AkalankaRan commented 3 weeks ago

Hey,

Thanks for your prompt reply! I really appreciate it.

I wanted to share some observations after spending 3-4 hours trying the bulk download feature last night. It worked in a few cases, and I noticed the following:

If the audio field already has previous audio, the bulk download skips it instead of downloading new audio. In the previous version, the tool cleared all existing audio and text from the target field before downloading new content, even when I had the "clear fields" checkbox selected.
Sometimes, the bulk download worked after I manually downloaded audio for one word first and then switched to bulk mode.

On a side note, I developed a Python program using BeautifulSoup three years ago. It scrapes word definitions, examples, and audio from Oxford Learner's Dictionaries (https://www.oxfordlearnersdictionaries.com/) for English words.

Over the last couple of years, I’ve used it for more than 8,000 English words. The program generates a CSV file that can be imported into an Anki deck, and the audio files are copied into the media folder.
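For readers curious what the export step described above might look like, here is a minimal sketch. It is not Akalanka's program: the function names, the column order, and the sample entry are all made up for illustration, and the scraping step is left out entirely. It only shows how rows holding an Anki `[sound:...]` reference could be written to CSV.

```python
import csv
import io


def build_anki_rows(entries):
    """Turn already-scraped entries into CSV rows (hypothetical column order)."""
    rows = []
    for entry in entries:
        rows.append([
            entry["word"],
            entry["definition"],
            entry["example"],
            # Anki plays a media file referenced with the [sound:...] syntax.
            f"[sound:{entry['audio_file']}]",
        ])
    return rows


def write_anki_csv(entries, handle):
    """Write the rows to any file-like object."""
    csv.writer(handle).writerows(build_anki_rows(entries))


# Placeholder data, not real dictionary content.
entries = [{
    "word": "elegant",
    "definition": "placeholder definition",
    "example": "placeholder example",
    "audio_file": "elegant.mp3",
}]

buffer = io.StringIO()
write_anki_csv(entries, buffer)
csv_text = buffer.getvalue()
```

The audio files themselves would still need to be copied into the collection's media folder, as the comment describes.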

This might be useful for English vocabulary learners if you’re interested in integrating it into an Anki add-on. I’d be happy to send you the program with a full explanation. Integrating with Anki is new to me, and I haven't found the time to learn it.

I recently moved to Finland and have already learned over 2,000 Finnish language words using your add-on.

Thanks again for creating such an elegant tool to help learners like me!

Best regards, Akalanka

Rascalov commented 3 weeks ago

Thank you for the extra information. I honestly haven't looked at my bulk downloader since its release, so I'll take some time over the weekend to make it use the newer audio grabbing. In particular, it seems that injecting the audio source into the card is what's especially broken. As for your Python program, you're welcome to open a new issue for it with the details. I'll take a look and respond to it there.

Rascalov commented 3 weeks ago

Took a bit longer than expected, but I updated it with the fix. If you open Anki, you should be getting a prompt for the update.

I'll close this issue. If you still experience problems, you can create another issue and I'll look into it.

AkalankaRan commented 2 weeks ago

Hey,

Hi Rascalov,

Thank you very much for your attention to this matter.

I’ve been quite busy over the last weekend, but I managed to test the updated bulk downloader with a new deck containing a few Finnish words.

Unfortunately, I didn't get the expected results from the bulk download, although the one-by-one pronunciation downloads are working well.

I’ll take a few more days to test it under different conditions (different IP addresses, etc.) and with some other adjustments.

I'll keep you posted with the results soon.

Thank you again for your support in helping me learn thousands of words with pronunciations in Anki.

Best regards, Akalanka

Rascalov commented 2 weeks ago

Thank you for the update. The one-by-one behavior is intentional, but if you could give me the specific sentences, I can check in more ways. As for the IP address: if you are worried about getting blocked by Forvo, there's no need. That issue is resolved.

Creat0r-1 commented 1 week ago

Hey, are there any updates on this? I am facing the same problem: bulk download doesn't seem to work for single words or sentences anymore. I checked both methods, "CDN + Forvo" and "Only Forvo". Secondly, I have noticed that some sentences do not come up with any results even though they exist on the Forvo site. For example, you can check the following in Swedish: Har ni några frågor? Förstår du? Kan du säga det en gång till?

And lastly, is there a way to disable the one-by-one word audio search in the code? I mean: search for a phrase and, if it doesn't exist, skip it altogether instead of downloading a separate audio file for each word. For example, "en gång i månaden" downloads 4 separate audio files. Great work nonetheless. Thank you.

Rascalov commented 1 week ago

Thank you for the examples, I think I can see what is going on. The sentences are there, but my lookup logic has some flaws when it comes to punctuation (characters like ?, !, etc.). I'll see what I can do.
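To illustrate the kind of punctuation problem described here, the sketch below compares phrases after stripping punctuation and normalizing case and whitespace. This is an assumption about one possible approach, not the add-on's actual lookup code; `normalize_phrase` and `find_phrase` are hypothetical names.

```python
import unicodedata


def normalize_phrase(phrase: str) -> str:
    """Lowercase, drop punctuation characters, and collapse whitespace."""
    kept = []
    for ch in phrase:
        # Unicode general categories starting with "P" are punctuation (?, !, ., ...).
        if unicodedata.category(ch).startswith("P"):
            continue
        kept.append(ch)
    return " ".join("".join(kept).lower().split())


def find_phrase(phrase, known_phrases):
    """Return the first known phrase that matches once punctuation is ignored."""
    wanted = normalize_phrase(phrase)
    for candidate in known_phrases:
        if normalize_phrase(candidate) == wanted:
            return candidate
    return None
```

With this kind of comparison, "Förstår du" and "Förstår du?" would resolve to the same entry, while letters such as å and ö are left untouched.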

Rascalov commented 1 week ago

Alright, it should work now. Give it a try if you want. As for the options to disable one-by-one word download, I'll take a look tomorrow.

Creat0r-1 commented 1 week ago

Thank you for your reply. I thought the config "ignorePunctuation": "True" did the job already.

I tested it, and the first 2 phrases seem to work now. The third one, "Kan du säga det en gång till?", does not come up. Same for "Hur säger man ... på svenska?".

BUT, something changed now: the generated audio files are silent, and I can't preview them either. I tried to locate them in the collection folder and didn't manage to do so. I noticed that the generated files changed their ID (?), i.e. from [sound:elev-389066.mp3] to [sound:elev-ea2819c4e8891080a3e1e817df04f3145cfa8be84e07dcdf261bc58e27bd1b89.mp3].

Maybe Forvo made changes again. The result is that "Auto Fetch" still does not work, and "Select and fetch" provides results (?) but does not download audio files...

Rascalov commented 1 week ago

> I thought the config "ignorePunctuation": "True" did the job already.

Like many things in this project, it's broken :P

> BUT, something changed now and the generated audio files are silent neither can I preview them.

I haven't been able to reproduce this. The preview works fine on my end with both manual and automated fetch. Can you tell me which operating system you're using?

> I noticed that the generated files changed their ID

Yeah, the full story is that Forvo has blocked automated downloads pretty much entirely (anti-bot/scraping stuff). The workaround I use revolves around a dataset of Forvo audio files that some people from Freemdict host. The ID you see is a hash value of the mp3 file.
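As a rough illustration of naming a file after a hash of its contents: the thread does not say which hash algorithm is used. The 64 hexadecimal characters in the example file name above match the length of a SHA-256 digest, so the sketch below assumes SHA-256. `hashed_audio_name` is a hypothetical helper, not a function from the add-on.

```python
import hashlib


def hashed_audio_name(word: str, mp3_bytes: bytes) -> str:
    """Build a name like 'elev-<hex digest>.mp3' from the file's bytes.

    Assumption: SHA-256, chosen only because its hex digest is 64 characters long.
    """
    digest = hashlib.sha256(mp3_bytes).hexdigest()
    return f"{word}-{digest}.mp3"


# Stand-in bytes; a real call would pass the contents of the mp3 file.
example_name = hashed_audio_name("elev", b"not a real mp3")
```

Because the name is derived from the content, the same audio always maps to the same file name, which is one reason a dataset might be keyed this way.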

Creat0r-1 commented 1 week ago

So you're saying that auto fetch is working for you? Auto fetch is not working at all for me. It seems to search and complete without errors, but no audio is written to the cards of the deck.

Regarding manual fetch, single words work as before. I was referring to the phrases above: Forvo "Har ni några frågor?" shows a result now and writes the audio to the flashcard field when I choose it. But the audio file is not in the collection folder (I searched) and thus is not playable. I thought maybe the long filename (?) has something to do with this. I am using Windows 10 x64. Should I update something in the add-on? Thanks.

Rascalov commented 1 week ago

Alright, I see the issue now. If you update the add-on, it should work. Apparently, punctuation in file names is handled differently on Windows and Linux.
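The thread does not say which characters caused the problem, but Windows rejects several characters in file names that Linux accepts, including the question mark that ends many of the phrases discussed here. A minimal sketch of a sanitizer, assuming that this was the kind of issue involved (`safe_audio_filename` is a hypothetical name, not the add-on's code):

```python
import re

# Characters Windows does not allow in file names: < > : " / \ | ? *
_WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')


def safe_audio_filename(phrase: str, suffix: str = ".mp3") -> str:
    """Derive a file name from a phrase that is valid on Windows and Linux."""
    cleaned = _WINDOWS_FORBIDDEN.sub("", phrase)
    # Windows also rejects names that end in a dot or a space.
    cleaned = cleaned.strip().rstrip(". ")
    # Replace runs of whitespace with underscores for readability.
    cleaned = re.sub(r"\s+", "_", cleaned)
    # Fall back to a generic name if nothing is left.
    return (cleaned or "audio") + suffix
```

A phrase such as "Har ni några frågor?" would then be stored as `Har_ni_några_frågor.mp3` on both systems.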

Creat0r-1 commented 1 week ago

Amazing, now it seems to preview the audio files and download them to the folder. I noticed, though, that the phrase "Kan du säga det en gång till?" still doesn't come up. Also, Skriv! provides no results while Skriv does, so you might want to include (! .) and other special characters.

In addition, Auto fetch download still does not work for me. I'll try on a new computer with a fresh Anki installation when I have time and report back. Thank you again.

Rascalov commented 1 week ago

> "Kan du säga det en gång till?"

Yeah, I looked it over in the dataset, and it seems to have missed that one. Just for you, I added it manually. In the long term I'll keep it up to date automatically.

> Also Skriv! provides no results while Skriv does, so you might want to include (! .) and other special characters.

Meh, search in general could be better. It's all jumbled together. In this case though, it seems that Skriv! with the exclamation is not on forvo either, only the full "skriv! - skriva - skriver - skrev - skrivit" (which you can't fetch either ;) ).

> I ll try on a new computer with fresh Anki installation when I have time and report back.

Thank you for taking the time. I hope it is an installation issue, but we'll see.

Creat0r-1 commented 2 days ago

OK, I think I found what the problem is with auto fetch. I have a separate audio field for organizing my cards, so I used that for the bulk download. It seemed to scan and download, but in the end it didn't work.

I tried to use the same Language and Audio field and it seems to download and register audios now. This behavior does not happen on manual download only on bulk. Can you confirm if you are using a separate audio field or the same field as language?

I guess I can't have it all, I'll live with a merged audio and word field... :) But now we are back to the one-by-one word download issue, which is quite annoying. I wish you could somehow download only words or phrases as a single file if they exist on Forvo, and otherwise skip them altogether. Thanks for your work.

Creat0r-1 commented 1 day ago

Just for reference, for anyone reading this who might find it helpful to disable one-by-one word downloading: edit the file ovrofCDN.py in the add-on folder and comment out the loop (lines 37-40).
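The behavior requested in this thread (use a whole-phrase recording if one exists, otherwise skip the card instead of fetching each word) can be sketched as below. This is not the code in ovrofCDN.py; `pick_audio` and the `available` dictionary are hypothetical stand-ins for whatever lookup the add-on performs.

```python
def pick_audio(phrase, available, allow_word_fallback=False):
    """Return the audio file names to attach for `phrase`.

    `available` maps a word or phrase to a file name (hypothetical structure).
    """
    if phrase in available:
        # A recording of the full phrase exists: use only that one.
        return [available[phrase]]
    if not allow_word_fallback:
        # No full-phrase recording: skip the card entirely.
        return []
    # Fallback: one file per word that has a recording.
    return [available[word] for word in phrase.split() if word in available]


words_only = {"en": "en.mp3", "gång": "gang.mp3", "i": "i.mp3", "månaden": "manaden.mp3"}
```

With `words_only`, the phrase "en gång i månaden" yields four files when the fallback is enabled and none when it is disabled, which matches the difference described above.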

AkalankaRan commented 1 day ago

Hi,

I made the following change to get bulk downloading to work correctly:

File: AnkiForvoAudioGenerator.py
Line 45: Corrected the logic so it works for inserting audio into a different field.

Thanks to @Creat0r-1 for pointing out that it works for inserting audio into the same field.

Also, big thanks to @Rascalov for the original coding. I used ChatGPT to help identify the issue and suggest corrections.

Cheers!

```
# clear audio so the search won't include that part (separate from previous line)
# fieldNameValue = self.clearPreviousInput(card.note()[target.targetFieldName], AudioClearingOptions.AUDIO_CLEAR)
fieldNameValue = self.clearPreviousInput(fieldNameValue, AudioClearingOptions.AUDIO_CLEAR)
```

Creat0r-1 commented 19 hours ago

Thank you for correcting the audio field issue, I couldn't find a way around it. Now the add-on works as intended, at least for me. Happy downloading. :)
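For anyone trying to follow the fix, here is a standalone sketch of the idea as far as the snippet shows it: the search term is taken from the source field's value (with any old `[sound:...]` tags stripped) rather than from the target audio field, so bulk download also works when the two fields differ. Everything here is illustrative. The note is modeled as a plain dictionary, and `strip_sound_tags`, `fill_audio`, and `fetch` are hypothetical names, not the add-on's API.

```python
import re

# Matches Anki sound references such as [sound:elev-389066.mp3].
SOUND_TAG = re.compile(r"\[sound:[^\]]*\]")


def strip_sound_tags(value: str) -> str:
    """Remove [sound:...] references so they are not part of the search term."""
    return SOUND_TAG.sub("", value).strip()


def fill_audio(note: dict, source_field: str, target_field: str, fetch) -> dict:
    """Return a copy of `note` with audio for the source text in the target field."""
    # Build the search term from the SOURCE field, not from the target field.
    term = strip_sound_tags(note[source_field])
    filename = fetch(term)  # hypothetical callable: term -> file name or None
    if not filename:
        return note
    updated = dict(note)
    updated[target_field] = f"[sound:{filename}]"
    return updated
```

If the term were read from an empty, separate audio field instead, the search would run on an empty string and nothing would be written, which is consistent with the symptom reported earlier in the thread.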

Rascalov / Anki-Simple-Forvo-Audio

Bulk Download Feature - Not Working After New Update #39