morpheus65535 / bazarr

Bazarr is a companion application to Sonarr and Radarr. It manages and downloads subtitles based on your requirements. You define your preferences by TV show or movie and Bazarr takes care of everything for you.
https://www.bazarr.media
GNU General Public License v3.0

Indexing embedded subtitles while embed option is off #763

Closed ghost closed 4 years ago

ghost commented 4 years ago

Describe the bug

Bazarr is indexing a number of embedded subtitles. On each Radarr scan the below message appears. It takes considerably longer than the ordinary scan and is executing every 5 mins as part of the scheduled tasks, taking up resources.

[Screenshot, 2020-01-19 at 11:26:48]

Thought it might be related to this, that it had detected existing archived embedded subs and was now re-scanning them: https://github.com/morpheus65535/bazarr/issues/378

So I made sure that "Use Embedded Subtitles" was switched off, then ran "Update all Movie Subtitles from disk" from the tasks page, but it is still doing the same thing.

Expected behavior

With the embedded subtitles option off, it shouldn't be checking any embedded subtitles, new or old, or there should be a way to clear memory of any existing embedded subtitles.

Software (please complete the following information):

Bazarr Version: 0.8.4
Sonarr Version: 3.0.3.688
Radarr Version: 3.0.0.2541
Operating System: Linux-4.19.80-v7+-armv7l-with
Python Version: 3.8.1

Running the LinuxServer docker container.

morpheus65535 commented 4 years ago

The expected behaviour you are referring to is exactly what happens with Bazarr, and I just tested it using the steps you provided. I cannot reproduce your issue. If I disable embedded subtitles indexing and force a full update of movies, it takes approximately 3 seconds for 450 movies. If I re-enable this setting and update again, it takes approximately 30 seconds. The 5-minute interval call to Radarr completes in less than 5 seconds. I'm pretty sure something's wrong with your setup, and it is probably related to path mapping for movies. Can you provide a debug log while reproducing your issue (clear it before)?

ghost commented 4 years ago

Might be linked to another issue that has appeared. In the logs there are errors:

BAZARR Error trying to get video information for this file: /movie/movieexample/mov.mp4

It has occurred for a handful of movies, but not all.

When going to the movie in Bazarr, I see that the paths in the logs are the same as the entry on the Bazarr movie page.

Yet in Radarr (Aphrodite) that isn’t the path set for the corresponding movie. There is only one .mp4 affiliated with the movie entry in Radarr and it specifies /movie/newmovie/newmovierenamed.mp4

I did use the editor on Radarr to move, rename and reorganise the movies (about 3 weeks ago), and it appears for some it is maintaining the old dir path rather than updating to the new one now in the Radarr database.

Another detail I forgot to mention: the movies and subs are all hosted through rclone.

Will try and get logs across tomorrow, let me know if other info/logs of other activities are required in light of this.

ghost commented 4 years ago

I enabled Use Embedded Subtitles and did a disk scan. The number of embedded subtitles being indexed every five minutes then went up to 164. I disabled Use Embedded Subtitles, scanned again, and it went back to 129, where it was before. This suggests to me that it is behaving normally, in that it is not re-indexing subtitles already indexed, other than these 129 anomalies.

Looking across the Radarr DB file, there are entries for the old path and the new path of the issue movies, presumably the former being the history.

Looking across the Bazarr DB I see the old path entry for the problem movies, but none for the new path.

On a disk scan, there are 258 entries coming up as file doesn't exist, with the wrong file path.
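The "file doesn't exist" entries described above could be listed directly from a copy of the Bazarr database. This is a minimal sketch, not Bazarr's own code: the table name `table_movies` and the columns `title` and `path` are assumptions (the thread only mentions a movie table with a `path` column), and `find_missing_paths` is a hypothetical helper.

```python
import os
import sqlite3


def find_missing_paths(conn, exists=os.path.exists):
    """Return (title, path) rows whose stored path is not present on disk.

    Assumes a table named table_movies with title and path columns;
    adjust the names to match the actual schema.
    """
    rows = conn.execute("SELECT title, path FROM table_movies").fetchall()
    return [(title, path) for title, path in rows if not exists(path)]
```

Calling `find_missing_paths(sqlite3.connect("copy-of-bazarr.db"))` on a copy of the database (the file name here is a placeholder) would return the entries whose paths no longer resolve, which can then be compared with what Radarr reports.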

Seems it has either cached old directory paths and not updated them, or is reading from the wrong part of the Radarr DB? What is best to send for debugging here, a log of a particular action, or the DBs?

Sonarr is running on the same system and setup without issues, but it hasn't been through the same embedded subtitles scans in the past, or had major series renames and moves.

--

On a separate note, if Radarr imports subtitles that came with the movie, will Bazarr pick those up or does it require a disk scan?

morpheus65535 commented 4 years ago

Bazarr makes an API call to Radarr to retrieve each movie's movieFile path, but only uses the first one provided. If Radarr's API is providing wrong information, how can it work well for Bazarr?

Regarding your last question, yes, the included external subtitles (and embedded ones, depending on settings) stored with the videoFile are indexed when the movie is added to the Bazarr db.

ghost commented 4 years ago

Good point. I checked the Radarr output the way Bazarr does it, as best I can tell:

http://URL/radarr/api/movie?apikey=XXXX

I get a list of all the movies returned, and nowhere in that list do I see a mention of the old paths.

I guess it means Bazarr is caching the old ones and not updating with the new?

I just did another Update Movie list from Radarr, just to make sure there hasn't been a recent change to Radarr. I then did another full disk scan, and I am still getting a whole bunch of movies with the old file path associated, despite those file paths not featuring anywhere in Radarr's API output.

No errors when doing the Radarr test in the connection settings, so looks connected ok. I have been getting subs for new movies recently too, so seems to be picking up newly added movies from Radarr API output.
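The manual check described in this comment (reading the Radarr API output and looking for the old paths) could be automated roughly as follows. This is a sketch under assumptions: it uses the `/api/movie?apikey=...` endpoint quoted above, assumes each returned movie object carries a folder `path` and an optional `movieFile` with either a full `path` or a `relativePath`, and does not apply any Bazarr path mapping. The function names are hypothetical.

```python
import json
import posixpath
import urllib.request


def fetch_movies(base_url, api_key):
    """Fetch the movie list from the endpoint quoted in the thread."""
    url = f"{base_url}/api/movie?apikey={api_key}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def radarr_file_paths(movies):
    """Collect the full file path of every movie that has a file."""
    paths = set()
    for movie in movies:
        movie_file = movie.get("movieFile")
        if not movie_file:
            continue  # movie known to Radarr but no file imported yet
        if "path" in movie_file:
            # Assumed field: full path reported on the movieFile object.
            paths.add(movie_file["path"])
        else:
            # Assumed fields: movie folder plus file name relative to it.
            paths.add(posixpath.join(movie["path"], movie_file["relativePath"]))
    return paths


def stale_paths(bazarr_paths, movies):
    """Paths stored on the Bazarr side that Radarr no longer reports."""
    return sorted(set(bazarr_paths) - radarr_file_paths(movies))
```

Feeding `stale_paths` the paths read from the Bazarr database and the result of `fetch_movies` would list exactly the entries that still point at the old locations.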

morpheus65535 commented 4 years ago

Seems to be a strange issue... I've just tested with 0.8.4.1 in the dev branch by altering a path in the database and then forcing a movies sync. The path gets updated instantly. I would suggest emptying the movie table and letting Bazarr re-index everything.

ghost commented 4 years ago

It appears all the affected movies have subtitles downloaded and affiliated with them, and the subtitles are in the correct file paths. Could it be that Bazarr is seeing the file path from the subtitle file matches the file path from the Radarr API output and then passes on the check for the movie file path directory?

I'm taking stabs in the dark a bit now, that was the last thing I could think of.

Not sure I understand what you mean by emptying the movie table. If I clear out the movies and re-index, will it not lose the knowledge of which subs it has already downloaded, where from and their rating?

ghost commented 4 years ago

I tried removing all the content from the table_movie path column in the .db, but the column seems to be set to not accept NULL and to only accept unique entries, so it proved a little tricky. I didn't want to mess around with those settings, so I ended up deleting the whole .db and reindexing the lot. Not ideal, but I exhausted the time I had. Apologies that I can't close with a more constructive answer; it is beyond my abilities.

Thanks for all your help.

morpheus65535 commented 4 years ago

Sorry for the delay. You needed to delete the row, not empty the cell content. Re-open the issue if it comes back.
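The difference between emptying the cell and deleting the row can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the table name `table_movies` and its `path` column are assumptions based on the description above (a path column that is NOT NULL and UNIQUE), `delete_movie_rows` is a hypothetical helper, and working on a copy of the database is the safer route.

```python
import sqlite3


def delete_movie_rows(conn, stale_paths):
    """Delete whole rows for the given paths and return how many were removed.

    Setting path to NULL would violate the NOT NULL constraint described
    in the thread, so the entire row is removed instead.
    """
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "DELETE FROM table_movies WHERE path = ?",
            [(path,) for path in stale_paths],
        )
    return conn.total_changes - before
```

After the rows are removed, a movie sync from Radarr should be able to add the movies again with their current paths, as suggested earlier in the thread.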