google / sagetv

SageTV is a cross-platform networked DVR and media management system
http://forums.sagetv.com/
Apache License 2.0

Library Importer repeatedly processes the same files #351

Open JREkiwi opened 7 years ago

JREkiwi commented 7 years ago

I'm seeing a large number of files that show up in the log during every regular library import.

[LibraryImporter@dff67a] testFile=
[LibraryImporter@dff67a] Attempted to add already existing file path to database:
[LibraryImporter@dff67a] New Library File MediaFile Importer.txt

It seems to be the same 100 or so files on every scan. I've done a full rescan/redetect of the library, and the only thing I can see that they have in common is that they are all .mp4 files.

Narflex commented 7 years ago

There is a case difference in the filenames, so they don't match in the String-based name comparison. They only match in the File-based comparison that runs when actually trying to add the file. If you move them out of the directory, do a rescan, then move them back and scan again, it won't do this anymore, because they'll then be stored with the current filename. The code could probably be changed to resolve this, but IIRC there's a more intensive technique used when adding the MediaFile that ensures we're not duplicating something based on a case difference when the OS doesn't care about case, and doing that during the scanning process might have caused scanning to use too much CPU.

Sat 8/12 13:04:31.047 [LibraryImporter@dff67a] testFile=Shaun the Sheep s05e08 Dangerous Deliveries.mp4
Sat 8/12 13:04:31.047 [LibraryImporter@dff67a] Attempted to add already existing file path to database: \JREkiwi\avi2\Comedy\Shaun the Sheep\Season 05\Shaun the Sheep s05e08 Dangerous Deliveries.mp4 returning exisitng MF of: MediaFile[id=27722999 A[28812191,26934327,"Shaun the Sheep",0@0916.08:06,7,V] mask=V host=BOB encodedBy= format=Quicktime 0:07:00 4379 kbps [#0 Video[H.264 50.0 fps 1280x720 16:9 4243 kbps progressive]#1 Audio[AAC 48000 Hz 2 channels 125 kbps idx=1 eng]{Producer=, SeriesInfoID=0, PropertiesWrittenBy=CMT, MediaType=TV, Copyright=2016 British Broadcasting Corporation;all rights reserved, AlbumArtist=BBC TV, Director=, Actor=, Choreographer=, Comment=Shaun decides to take on the role of Bitzer's therapist;with unexpected consequences., MediaProviderID=tvdb, Writer=, MediaProviderDataID=79890, ScrapedBy=Phoenix, Guest=, Host=, MediaTitle=Shaun the Sheep, ScrapedDate=1474093388515}] \JREkiwi\avi2\Comedy\Shaun the Sheep\Season 05\Shaun the Sheep S05E08 Dangerous Deliveries.mp4, Seg0[Fri 9/16 8:06:57.198-Fri 9/16 8:13:57.198]]
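To illustrate the mismatch described above, here is a minimal sketch in Java. It is not SageTV's actual code: the class `CaseInsensitivePathLookup` and its methods are hypothetical names made up for this example. It shows why an exact String comparison misses `s05e08` vs `S05E08`, and how a lower-cased lookup key could catch the difference cheaply. Lower-casing only approximates what a case-insensitive filesystem does, and on a case-sensitive filesystem the two names would be different files, so treat this as an assumption-laden illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of SageTV.
class CaseInsensitivePathLookup {
    // Maps a lower-cased name to the name exactly as it was stored.
    private final Map<String, String> storedByKey = new HashMap<>();

    // Normalised key; assumes lower-casing is a good enough stand-in
    // for the OS's case-insensitive matching.
    public static String key(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    public void addStored(String name) {
        storedByKey.put(key(name), name);
    }

    // True only when the stored name matches character for character.
    public boolean containsExact(String name) {
        return name.equals(storedByKey.get(key(name)));
    }

    // Returns the stored name that matches ignoring case, or null if none.
    public String findIgnoringCase(String name) {
        return storedByKey.get(key(name));
    }

    public static void main(String[] args) {
        CaseInsensitivePathLookup index = new CaseInsensitivePathLookup();
        // Name as stored in the database (from the log above).
        index.addStored("Shaun the Sheep S05E08 Dangerous Deliveries.mp4");

        // Name as found on disk during the scan (testFile= in the log above).
        String scanned = "Shaun the Sheep s05e08 Dangerous Deliveries.mp4";

        System.out.println(index.containsExact(scanned));    // prints false
        System.out.println(index.findIgnoringCase(scanned)); // prints the stored S05E08 name
    }
}
```

A hash lookup on a normalised key costs about the same as the exact String lookup, which is one possible way around the CPU concern, but whether it fits how the importer tracks files is an open question.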

JREkiwi commented 7 years ago

Thanks, that got rid of those already existing file path messages.

Now I'm seeing a lot of Redetecting format messages for TV files, on a different group of files, on every library import scan. It's always the same files. I'll run another reindex and see if that helps.

redetect.txt

I only noticed this because I was looking in the logs to work out why I was getting regular (hourly) stuttering during playback of Blu-rays off my NAS, and discovered that it coincided with the Library Import scan.

JREkiwi commented 7 years ago

One thing I have found in common among the files that repeatedly trigger format redetection is that they are all TV files in an imported videos directory. However, not every TV file in the imported videos directories is affected.

JREkiwi commented 7 years ago

OK, these redetecting format messages only appear when a scan is triggered via STV (Setup/Scan Imported Media). They don't occur when the scheduled scan runs.