leoncvlt / blinkist-scraper

📚 Python tool to download book summaries and audio from Blinkist.com, and generate some pretty output

Issue with --concat-audio #10

Closed Ditiae closed 3 years ago

Ditiae commented 4 years ago

For a lot of blinks, the audio downloads fine, but when it comes to combining the files, something must fail silently, because all the audio files get deleted.

leoncvlt commented 4 years ago

Does it display any error in the terminal? Do you have ffmpeg installed and added to the PATH?

Ditiae commented 4 years ago

No errors, no. Yes, I have ffmpeg, and --concat-audio works for many of them. It's just these specific ones that download fine and then get deleted. When I run the program again, it skips the ones I already have audio for, but when it gets to the ones it deleted, it re-downloads them and then deletes them again.

Ditiae commented 4 years ago

Video of how it behaves: https://streamable.com/w8cry5

Screenshot of the terminal output: https://i.imgur.com/JnHhWen.png

rocketinventor commented 4 years ago

Hi @Wea1thRS,

I haven't had issues like what you described in production, but I have seen something like it while messing around with the concat audio code.

The issue that I found was that the paths to the audio files were incorrect, meaning FFmpeg did not know where to find the files.

To help debug this, we need to see the temp.txt file that is generated with the paths for the audio...

Instructions: Go to the file: generator.py and find the lines (near the bottom):

  if (os.path.exists(files_list)):
    os.remove(files_list)

Then delete/comment out the line:

os.remove(files_list)

My recommendation is to replace it with a print line, such as:

  #os.remove(files_list)
  print(f"os.remove({files_list})") # for debugging

...And then find that text file and copy it here (or check yourself if the paths are correct).
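Once temp.txt is kept around, the paths in it can also be checked with a short script instead of by eye. This is only a sketch: it assumes the list uses the ffmpeg concat format (one `file '<path>'` entry per line) and that relative entries are resolved against the folder containing the list file. The function name `missing_concat_entries` is hypothetical and is not part of blinkist-scraper.

```python
import os
import re


def missing_concat_entries(list_path):
    """Return the paths named in a concat list file that do not exist on disk."""
    base_dir = os.path.dirname(os.path.abspath(list_path))
    missing = []
    with open(list_path, encoding="utf-8") as handle:
        for line in handle:
            # Assumed entry format: file 'chapter_1.m4a'
            match = re.match(r"\s*file\s+(.+?)\s*$", line)
            if not match:
                continue  # skip blank lines, comments, and other directives
            entry = match.group(1).strip("'\"")
            # Assumption: relative entries are relative to the list file's folder
            if os.path.isabs(entry):
                full_path = entry
            else:
                full_path = os.path.join(base_dir, entry)
            if not os.path.exists(full_path):
                missing.append(full_path)
    return missing


if __name__ == "__main__":
    for path in missing_concat_entries("temp.txt"):
        print(f"not found: {path}")
```

If the script prints nothing, every listed file exists and the problem is likely elsewhere. Entries containing escaped quotes are not handled by this simple parser.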

Ditiae commented 4 years ago

Upon looking, it creates the concatenated file just fine; however, it fails to tag it, for reasons I am unsure of. Maybe the title is too long? I can't seem to figure this out.

Actually, upon further testing, it seems it was just a limit on the number of subfolders and the length of the names, which Windows enforces very selectively.
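A quick way to see which downloaded files are at risk is to list every file whose full path reaches the classic Windows limit. The sketch below assumes the traditional `MAX_PATH` value of 260 characters (which includes the terminating null, so 259 usable characters); `find_long_paths` is a hypothetical helper, not something the scraper provides.

```python
import os

# Classic Win32 MAX_PATH; the count includes the terminating null character.
WINDOWS_MAX_PATH = 260


def find_long_paths(root, limit=WINDOWS_MAX_PATH):
    """Return absolute file paths under root whose length reaches the limit."""
    long_paths = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full_path = os.path.abspath(os.path.join(folder, name))
            if len(full_path) >= limit:
                long_paths.append(full_path)
    return long_paths


if __name__ == "__main__":
    for path in find_long_paths("."):
        print(len(path), path)
```

Running it from the folder that holds the downloaded books shows whether moving that folder closer to the drive root would bring the longest paths under the limit.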

rocketinventor commented 4 years ago

It seems similar to Issue #4.

Did you follow the instructions in the readme for long filenames? (https://github.com/leoncvlt/blinkist-scraper/blob/master/README.md#quirks--known-bugs)

Ditiae commented 4 years ago

I have that enabled, yeah. It seems to work with some files and not others. It's very inconsistent: the HTML files and such work just fine, and only the m4a files sometimes have issues. But yeah, after I moved it out of the long chain of subfolders I had it in, it seems to be working much better.

leoncvlt commented 4 years ago

Hey, sorry about those issues, the causes are kinda hard to pin down. Might it be that you have an outdated ffmpeg distribution? As far as I remember the long paths incompatibility was an issue for them too, but it eventually got fixed.

rocketinventor commented 4 years ago

@Wea1thRS By "out of the string of subfolders", do you mean that you changed the path that main.py was running from? I did that, too, by using linked folders, but even so, I found some titles were too long (without changing the Windows settings).

Have you checked your version of FFmpeg to verify that it wasn't the source of the issue? If it is up to date, that matters, because it means more digging will be needed to track this issue down.
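One way to check which FFmpeg build is actually picked up from the PATH is to ask the executable itself. This is a minimal sketch: it relies only on `ffmpeg -version` printing its version banner as the first line of output, and `ffmpeg_version_line` is a hypothetical helper name.

```python
import shutil
import subprocess


def ffmpeg_version_line(executable="ffmpeg"):
    """Return the first line of `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None  # not installed, or not added to the PATH
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version_line() or "ffmpeg was not found on the PATH")
```

The printed line can be pasted into the issue so the version can be compared against a current release.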

rocketinventor commented 3 years ago

@leoncvlt @Wea1thRS If there are no new updates on this issue, then I would recommend closing it.