McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License

'tuple index out of range' in logs when trying to transcribe local files. #45

Closed Boyda-DBA closed 5 months ago

Boyda-DBA commented 5 months ago

I have the Docker container set up to transcribe a folder of media. The log shows errors on many files, and the error is always 'tuple index out of range'. Some files seemingly work fine.

McCloudS commented 5 months ago

Can you provide the log? Not tons of testing has been done on the local/folder transcription. Any chance you're running Docker on Windows? Do any of your file names have commas? I might need to change the delimiter.

McCloudS commented 5 months ago

Alright, I'm taking a guess that you had a path with commas. If that's the case, I just updated subgen.py to use a pipe '|' as the delimiter for the paths. I shouldn't have used valid filename/path characters as the delimiter.
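
To illustrate why a comma is a risky delimiter, here is a minimal sketch. The function name `split_folders` and the example paths are hypothetical and not taken from subgen.py; only the pipe delimiter and the idea of a delimited folder list come from this thread.

```python
def split_folders(raw: str, delimiter: str = "|") -> list[str]:
    """Split a delimiter-separated folder list, dropping empty entries."""
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


# Hypothetical setting: the second folder name contains a comma.
raw_value = "/media/tv|/media/movies/Crouching Tiger, Hidden Dragon"

pipe_split = split_folders(raw_value)        # both folders survive intact
comma_split = split_folders(raw_value, ",")  # the comma cuts a path in half

print(pipe_split)
print(comma_split)
```

A pipe can still appear in a file name on Linux, so this lowers the chance of a collision rather than removing it.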

Boyda-DBA commented 5 months ago

Here is an example of the errors in the logs on Unraid. (Attached: Screenshot 2024-02-05 140609)

McCloudS commented 5 months ago

What is your TRANSCRIBE_FOLDERS set to?

Boyda-DBA commented 5 months ago

For debugging purposes I only have the one directory. (Attached: Screenshot 2024-02-05 141248)

McCloudS commented 5 months ago

I swapped around some logic in subgen.py that might fix it. You'll have to do a re-pull.

Boyda-DBA commented 5 months ago

After removing the variables mentioned, it still shows the error, but only for a .nfo file. I will run this for a bit before re-pulling to confirm whether that is the issue. (image attached)

McCloudS commented 5 months ago

Bizarre... is_video_file checks the files to make sure they are a valid video container before trying to transcribe them. It shouldn't even get to that point with a .nfo file.
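
For comparison, a much simpler guard is an extension allow-list applied before any container check. This is not how is_video_file works (per the comment above, it inspects the container); the function name `has_video_extension` and the extension set below are assumptions made for illustration.

```python
from pathlib import Path

# Assumed allow-list; extend it to match the media actually in the library.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".avi", ".mov", ".webm"}


def has_video_extension(path: str) -> bool:
    """Return True when the file suffix is in the allow-list (case-insensitive)."""
    return Path(path).suffix.lower() in VIDEO_EXTENSIONS


print(has_video_extension("/media/show/episode.MKV"))
print(has_video_extension("/media/show/tvshow.nfo"))
```

An extension check is cheap but trusts the file name, so it would complement a container check rather than replace it.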

McCloudS commented 5 months ago

I added some checking that should prevent the tuple index out of range. If it continues to be an issue, try re-pulling and let me know. Closing for now, just re-comment if you need help.

Boyda-DBA commented 5 months ago

Still a problem after re-pulling. (image attached)

McCloudS commented 5 months ago

Does it appear to be impacting the operation at all? Kind of puzzling to me, but it appears tied to for root, dirs, files in os.walk(path): which should be the tuple it's talking about.
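
As a reference point, here is a small sketch of what os.walk yields and how Python produces this exact message. The helper name `list_files` and the temporary files are made up for the demonstration. Since os.walk always yields 3-tuples, the loop header itself would not normally raise this error; that is a general observation about Python, not a diagnosis of subgen.py.

```python
import os
import tempfile


def list_files(path: str) -> list[str]:
    """Collect every file path under `path` using os.walk."""
    found = []
    for root, dirs, files in os.walk(path):  # each entry is (root, dirs, files)
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("episode.mkv", "tvshow.nfo"):
        open(os.path.join(tmp, name), "w").close()
    entry = next(os.walk(tmp))  # first (root, dirs, files) tuple
    walked = [os.path.basename(p) for p in list_files(tmp)]

try:
    entry[3]  # only indexes 0..2 exist on a 3-tuple
except IndexError as exc:
    message = str(exc)

print(len(entry), walked, message)
```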

Boyda-DBA commented 5 months ago

Based on the logs, video files are still being transcribed, so I will leave it up. It just seems not to be excluding the non-video files for some reason.

I have a pretty basic understanding of Python, but is it possible the path_mapping function is erroring out and returning an unexpected value?

McCloudS commented 5 months ago

To be honest, this was my first Python project, so it's new to me too. It shouldn't if USE_PATH_MAPPING is set to false. I'm also struggling to understand why .nfo and image files are passing the is_video_file check.

I might need to do another update to remove the failed files from the queue, otherwise they'll stay there until container restart.
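
One way to sketch that cleanup is to drop each path from the queue whether or not transcription succeeds. How subgen actually tracks queued files is not shown in this thread, so the set-based queue, `process_queue`, and `fake_transcribe` are all hypothetical.

```python
def process_queue(queue: set[str], transcribe) -> list[str]:
    """Run `transcribe` on every queued path; return the paths that failed."""
    failed = []
    for path in sorted(queue):  # sorted() copies, so the set can be mutated
        try:
            transcribe(path)
        except Exception:
            failed.append(path)
        finally:
            queue.discard(path)  # never leave a finished or failed path behind
    return failed


def fake_transcribe(path: str) -> None:
    """Stand-in for the real work: fails on anything that is not .mkv."""
    if not path.endswith(".mkv"):
        raise IndexError("tuple index out of range")


pending = {"a.mkv", "b.nfo"}
failed = process_queue(pending, fake_transcribe)
print(failed, pending)
```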

Boyda-DBA commented 5 months ago

Very cool first project. I will take a peek in my spare time and report my findings, for what it's worth. Appreciate the responses.

McCloudS commented 5 months ago

I ran it on random directories on my computer and it threw out everything that wasn't a video file, so I'm still not sure how images and a .nfo file made it through on your end. I did some slight optimizations last night, but I doubt they made any difference for your case.