Open wtanksleyjr opened 1 year ago
I'm not sure what is going on here but it would be helpful to see what the actual inode value is so I added it in the logs for the latest release. If you have this happen again that same line of the log you shared will include the inode value.
100% will check for that.
-Wm
On Sat, Feb 11, 2023 at 3:55 PM advplyr @.***> wrote:
I'm not sure what is going on here but it would be helpful to see what the actual inode value is so I added it in the logs for the latest release. If you have this happen again that same line of the log you shared will include the inode value.
— Reply to this email directly, view it on GitHub https://github.com/advplyr/audiobookshelf/issues/1447#issuecomment-1426901479, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ7H6L3WCKK3Y5T2W7SF63WXARFVANCNFSM6AAAAAAUIIVNMY . You are receiving this because you authored the thread.Message ID: @.***>
@advplyr
This happens on my server sometimes after I do scan.
A couple things I've noticed:
metadata/cache/covers
fixes the problemI have v2.2.15
running and the cover problem just happened again but I don't see the Inode values and debug is on. Here's the log if it helps:
2023-02-18.txt
Book with incorrect cover is Trackers (li_o1kwryisuc929ch8ix
)
Cover being shown is Polar Shift (li_83e4qfl4ce94b1ibtq
)
Hope this helps
Wow, it happened ... I didn't expect to ever see it again, but this time I got something recent enough that I even have the logs!
163 [2023-03-22 15:31:58] DEBUG: [Scanner] scanFolderUpdates: Library item found by inode value=63328265. "Solo Leveling, Vol. 2 (Novel) - B09GHJTKSJ => 9524067" (Scanner.js:585)
Here's a lightly edited log (I searched for the two books concerned and kept everything between the two islands of log entries that mention them). I have the full logs as well, if you want.
(The second item is in a folder with a numeric name, in case you're wondering - it's a unique ID. The other one is how I have Libation configured to save my Audible books.)
The sequence of events here was that I'd returned and deleted "Solo Leveling", so the item is marked missing; but in the meantime I checked out "The Staff Engineer's Path". Apparently the latter item is somehow seen to have the same inode as the former (you can now see its value right above).
One question ... what file is being examined for an inode there?
Ah that could be it. Just to clarify, you first deleted the book folder (which would have freed up that inode value), then you added the new book which got incorrectly matched?
The inode being used is the folder
That's correct in this case, although not in any of the previous cases (those all involved both books still existing).
Additionally, when you compare inodes, are you guarding against comparing them across filesystems (inodes are not unique across filesystems)? Can you even tell from inside Docker? Perhaps this is the cause of the other guy who posted here involving both an SSD and HDD. In my case I know they're the same filesystem and even the same Docker mount point so they're sure to be unique.
On Wed, Mar 22, 2023 at 1:52 PM advplyr @.***> wrote:
Ah that could be it. Just to clarify, you first deleted the book folder (which would have freed up that inode value), then you added the new book which got incorrectly matched?
The inode being used is the folder
— Reply to this email directly, view it on GitHub https://github.com/advplyr/audiobookshelf/issues/1447#issuecomment-1480244946, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ7H6OXGXOUV6USY3AFGCLW5NRCPANCNFSM6AAAAAAUIIVNMY . You are receiving this because you were mentioned.Message ID: @.***>
My gut feeling is that this matching is there for only one special case: if the "newly discovered" folder is actually an already-scanned folder. The implication is you're detecting renames, which makes sense and I don't know why it's generating false positives except for this new case.
But it seems to me the current method isn't the greatest way to handle renames, because it creates both an orphaned book object and a new book object for the rename even when it's correct. And the new book object contains two layers of metadata, the result of the initial scan on the newer folder with the nonblank items from the older item applied over it (which, by the way, easily produces very bad results). It would seem better to process the rename detection by changing the folder name on the old book's object, instead of by creating a new book and then overwriting its metadata.
I'm trying to think of ways to handle this correctly, and it's hard to do. One thing that makes sense is to never treat a discovered inode as a rename when the book that has the old inode still exists -- that's a quick test that would completely stop all of the cases I reported above even though we still don't know why they happen.
That wouldn't stop my new problem, of course. But that's going to be a very rare problem, since it only happens if I leave a deleted item just sitting there while I load new items. It's rather a bad problem, mind you, because it can result in completely overwriting a new book's info with an old book. But I can't think of how to prevent it apart from keeping it rare ... maybe when a book's marked as completely missing, remove its inode record after finishing the scans as completely as possible? That way rename detections can still be done the old way, but inode reuse won't kill us (unless the system reuses the inode really quickly).
But it seems to me the current method isn't the greatest way to handle renames, because it creates both an orphaned book object and a new book object for the rename even when it's correct. And the new book object contains two layers of metadata, the result of the initial scan on the newer folder with the nonblank items from the older item applied over it (which, by the way, easily produces very bad results). It would seem better to process the rename detection by changing the folder name on the old book's object, instead of by creating a new book and then overwriting its metadata.
This was how it was originally built but there is a bug somewhere. I opened an issue for this a while ago https://github.com/advplyr/audiobookshelf/issues/1161
This issue came up again and has me looking at why some filesystems have such small inode values. This comment had some insight https://github.com/advplyr/audiobookshelf/issues/1733#issuecomment-1527178564
I don't think there is a way to detect the actual filesystem from inside docker but if there is that would be the best solution.
I'm trying to think of ways to handle this correctly, and it's hard to do. One thing that makes sense is to never treat a discovered inode as a rename when the book that has the old inode still exists -- that's a quick test that would completely stop all of the cases I reported above even though we still don't know why they happen.
This is a good idea.
For reference the inode value on my XFS filesystems are this long 649362779914890946
. So even if I had my audiobooks spread across 100 drives the likelihood of a collision would be very small.
I was finally got around to importing a bunch of audio books I picked up from old Humble Bundles in the past to m4b and adding them to my library. I added like ~60 books at once, and this trashed the metadata for like 50 books got trashed.
My system has is using zfs, and I have separate datasets I use for my audible backups, and the directory I use for books I purchased outside audible.
As an example. One of the books was the Big Finish book Phantasmagoria which I added to purchased_misc doing this trashed the metadata for the book 'The Happiness Hypothesis'.
Stat output
$ stat \
audible_backup/The\ Happiness\ Hypothesis\ \[B07D5HTRPZ\]/ \
purchased_misc/Phantasmagoria/
File: audible_backup/The Happiness Hypothesis [B07D5HTRPZ]/
Size: 6 Blocks: 17 IO Block: 512 directory
Device: 30h/48d Inode: 8717 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2022-09-05 07:39:03.923754797 +0000
Modify: 2022-09-05 07:39:06.943694478 +0000
Change: 2022-09-06 05:55:48.840891042 +0000
Birth: -
File: purchased_misc/Phantasmagoria/
Size: 3 Blocks: 1 IO Block: 512 directory
Device: 2eh/46d Inode: 8717 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2023-07-12 21:45:04.321678078 +0000
Modify: 2023-07-12 21:45:04.321678078 +0000
Change: 2023-07-14 05:13:27.441428101 +0000
Birth: -
Audiobookshelf logs
2023-07-14 05:13:29 DEBUG [Watcher] File Added /content/audiobooks/purchased_misc/Phantasmagoria/Phantasmagoria.m4b
2023-07-14 05:13:29 DEBUG [Watcher] Modified file in library "audiobooks" and folder "fol_0o7kftyldks4h6jd8o" with relPath "/Phantasmagoria/Phantasmagoria.m4b"
2023-07-14 05:13:33 INFO [Server] 1 Files Changed
2023-07-14 05:13:33 DEBUG [Scanner] Scanning file update groups in folder "fol_0o7kftyldks4h6jd8o" of library "audiobooks"
2023-07-14 05:13:33 DEBUG [Scanner] scanFolderUpdates fileUpdateGroup [object Object]
2023-07-14 05:13:33 DEBUG [Scanner] scanFolderUpdates: Library item found by inode value=8717. "The Happiness Hypothesis [B07D5HTRPZ] => Phantasmagoria"
2023-07-14 05:13:33 DEBUG [Scanner] Folder update for relative path "Phantasmagoria" is in library item "The Happiness Hypothesis [B07D5HTRPZ]" - scan for updates
2023-07-14 05:13:33 WARN [LibraryItem] Check scan item changed folder fol_41bzqfhnyn7l3wnjak -> fol_0o7kftyldks4h6jd8o
2023-07-14 05:13:33 DEBUG [LibraryItem] "The Happiness Hypothesis [B07D5HTRPZ]" check scan cover removed
2023-07-14 05:13:33 DEBUG Library Item "/content/audiobooks/purchased_misc/Phantasmagoria" Media file scan took 79ms with 1 audio files averaging 79ms per MB
2023-07-14 05:13:33 DEBUG [Book] Tracks being rebuilt...!
2023-07-14 05:13:33 DEBUG [Book] setChapters: Using embedded chapters in first audio file /content/audiobooks/purchased_misc/Phantasmagoria/Phantasmagoria.m4b
2023-07-14 05:13:33 DEBUG [FfmpegHelpers] Extract Cover Cmd: ffmpeg -i /content/audiobooks/purchased_misc/Phantasmagoria/Phantasmagoria.m4b -y -map 0:v -frames:v 1 /metadata/items/li_ee8ct39mbne8iuzu8g/cover.jpg
2023-07-14 05:13:33 DEBUG [FfmpegHelpers] Cover Art Extracted Successfully
2023-07-14 05:13:33 DEBUG [Scanner] Saved embedded cover art "/metadata/items/li_ee8ct39mbne8iuzu8g/cover.jpg"
2023-07-14 05:13:33 DEBUG [LibraryItem] Success saving abmetadata to "/metadata/items/li_ee8ct39mbne8iuzu8g/metadata.abs"
2023-07-14 05:13:34 DEBUG [DB] Library Items updated 1
2023-07-14 05:13:34 DEBUG [Scanner] Folder scan results [object Object]
Anyway my first suggestion is if you see an 'File Added' that where the inode matches an existing book. Add a check so that you don't update the old book, if the files still exist for the old book.
As of 2.4.3 the scanner is no longer checking for inode first. It is looking for exact matching file path first then falling back to inode. There is more to be done here because you could still have this collision for renames.
Describe the issue
Docker-compose running ABS 2.2.12 on Ubuntu Linux 22.04; all folders mapped to the same 8TB HDD (not SDD). Prefer OPF data, and ODMs as well. Only one Library configured (I've reported this in the past when I had two libraries, but I wasn't able to capture any real information at the time).
I've noticed this many times, but always too late to get useful logs: I'd remember loading a book, but when I search for it, it isn't there. So I re-download. Later, I'd find a book that has the same cover as the one I re-downloaded but some other book's title. When I look at its folder and files, I find that the title matches the files, but the cover is from the newly missing book.
Interestingly, it turns out the "missing" book is right there on the drive where I'd put it, and absolutely nothing is ever suspicious about either the original book or the one that replaced it leaving only its cover - nothing's mixed up on the disk. So this is not a data loss problem, only metadata confusion.
So I've finally noticed this happening in time to actually capture logs -- the problem was that this is apparently random, so it usually gets books I don't notice because I loaded them a long time ago, or for which my logs are long gone.
The smoking gun is a line saying something like
[2023-01-27 03:12:49] DEBUG: [Scanner] scanFolderUpdates: Library item found by inode value "olderbook => newerbook" (Scanner.js:584)
. (Those two are the names of the folders directly containing the books.) I can see this line is intended to handle hard links, but every time it appears in my repo it's mistaken, there's no hard link there (if you want I'll deep-scan for duplicate inode numbers, but I've already confirmed none of the three books in this example share any common inode numbers for any of their files).My first example involves two "=>" lines that I actually WATCHED swap the same cover's title twice - it was random that I was looking in the right place at the right time. The original book is "The Skull of the World"; it had been loaded and Matched on the 22nd, and today the 26th I happened to be looking at newly loaded books, and I saw the cover for Skull of the World appear with the title "American Resistance", which I'd just gotten. Within a few second, the title changed again to "Unfree Speech" (still keeping the same "Skull" cover). Needless to say, both of those books had just been moved into the repo - but now I only have one book with the wrong cover and a jamble of metadata left over because the nonfiction books didn't have a subtitle or series so the resulting book had both of those from the original fiction book.
How's that for an intro? Any questions? Do I need to provide logs? Can I trim them a little? I've got them saved in full.
Steps to reproduce the issue
Audiobookshelf version
2.2.12
How are you running audiobookshelf?
Docker