advplyr / audiobookshelf

Self-hosted audiobook and podcast server
https://audiobookshelf.org
GNU General Public License v3.0
5.83k stars 404 forks source link

[Bug]: Recurring crash #2686

Closed sveinbjornpalsson closed 2 months ago

sveinbjornpalsson commented 4 months ago

Describe the issue

I'm not really sure what else I can add except these two things from the log.

Here's the log, the crash is in bold:

2024-02-28 12:53:39.090

INFO

=== Starting Server ===

2024-02-28 12:53:39.091

INFO

[Server] Init v2.8.0

2024-02-28 12:53:39.104

INFO

[Database] Initializing db at "/config/absdatabase.sqlite"

2024-02-28 12:53:39.183

INFO

[Database] Db connection was successful

2024-02-28 12:53:39.379

INFO

[Database] Db initialized with models: user, library, libraryFolder, book, podcast, podcastEpisode, libraryItem, mediaProgress, series, bookSeries, author, bookAuthor, collection, collectionBook, playlist, playlistMediaItem, device, playbackSession, feed, feedEpisode, setting, customMetadataProvider

2024-02-28 12:53:40.117

WARN

Removed 1 sessions that were 3 seconds or less

2024-02-28 12:53:40.124

INFO

[LogManager] Removed daily log: 2024-02-21.txt

2024-02-28 12:53:40.125

INFO

[LogManager] Init current daily log filename: 2024-02-28.txt

2024-02-28 12:53:40.222

INFO

[BackupManager] 0 Backups Found

2024-02-28 12:53:40.223

INFO

[BackupManager] Auto Backups are disabled

2024-02-28 12:53:40.248

INFO

[Watcher] Initializing watcher for "Fiction".

2024-02-28 12:53:40.251

INFO

[Watcher] Initializing watcher for "Podcasts".

2024-02-28 12:53:40.252

INFO

[Watcher] Initializing watcher for "Comedy".

2024-02-28 12:53:40.252

INFO

[Watcher] Initializing watcher for "non-fiction".

2024-02-28 12:53:40.252

INFO

[Watcher] Initializing watcher for "History & Biography".

2024-02-28 12:53:40.265

INFO

Listening on port :80

2024-02-28 12:53:40.381

INFO

[Watcher] "Podcasts" Ready

2024-02-28 12:53:40.770

INFO

[Watcher] "Comedy" Ready

2024-02-28 12:53:40.940

INFO

[Watcher] "non-fiction" Ready

2024-02-28 12:53:40.981

INFO

[Watcher] "History & Biography" Ready

2024-02-28 12:53:42.313

INFO

[Watcher] "Fiction" Ready

2024-02-28 13:00:00.649

INFO

[LibraryScanner] Library metadata precedence changed since last scan. From [folderStructure,audioMetatags,txtFiles,opfFile,absMetadata] to [audioMetatags,txtFiles,opfFile,absMetadata,folderStructure]

2024-02-28 13:00:00.651

INFO

[LibraryScanner] Starting (forced) library scan 05246d51-dacf-4a88-aafd-9b90c897b4de for Fiction

2024-02-28 13:00:20.746

ERROR

[parseEpubMetadata] Failed to extract META-INF/container.xml from epub at "/audiobooks/Author Name/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion]/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion].epub" Error: ENOENT: no such file or directory, open '/audiobooks/Author Name/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion]/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion].epub'

2024-02-28 13:00:20.749

FATAL

[Server] Unhandled rejection: Error: ENOENT: no such file or directory, open '/audiobooks/Author Name/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion]/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion].epub', promise: Promise { [Error: ENOENT: no such file or directory, open '/audiobooks/Author Name/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion]/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion].epub'] { errno: -2, code: 'ENOENT', syscall: 'open', path: '/audiobooks/Author Name/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion]/Redacted The Step By Step Guide To Redacted by Works, Companion [Works, Companion].epub' } }

Steps to reproduce the issue

I'm not interacting with app or server when this happens. Or I don't think I am? I may have opened the app at those times? Will add to this if that turns out to be the case.

Audiobookshelf version

v2.8.0

How are you running audiobookshelf?

Docker

sveinbjornpalsson commented 4 months ago

I've set log level to debug. First thing I did was to run docker-compose, but I don't think it upgraded, I think it was already on latest release (2.8.0) I just opened the ios client, it seemed to work, it's still up.

nichwall commented 4 months ago

So there used to be a file at that location, but then you moved it and now it's crashing the server?

sveinbjornpalsson commented 4 months ago

So there used to be a file at that location, but then you moved it and now it's crashing the server?

Funny thing is, this crash is recent, but those changes were made quite a while ago.

nichwall commented 4 months ago

Does the book exist in your ABS library (shows up without scanning)? If so, does it show up as missing or let you interact with it?

sveinbjornpalsson commented 4 months ago

No. I searched for it when I saw the error. No registration of it. Which is also strange, I haven’t deleted anything. Or I don’t think I have?

On Wed, 28 Feb 2024 at 13:55, Nicholas W @.***> wrote:

Does the book exist in your ABS library (shows up without scanning)? If so, does it show up as missing or let you interact with it?

— Reply to this email directly, view it on GitHub https://github.com/advplyr/audiobookshelf/issues/2686#issuecomment-1969037473, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABY7EL3W5YXPWQZRNRIBTY3YV4ZOJAVCNFSM6AAAAABD6ARPHKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRZGAZTONBXGM . You are receiving this because you authored the thread.Message ID: @.***>

mikiher commented 4 months ago

There's another thing I don't understand about the logs above

[LibraryScanner] Starting (forced) library scan 05246d51-dacf-4a88-aafd-9b90c897b4de for Fiction

Who initiated the forced library scan? As far as I understand, the force-rescan option doesn't even exist anymore on the client in recent versions, it's only available through the API. The round hour (13:00:00) on the log entry maybe suggests that the scan was started by the CronManager (although AFAIK the CronManager does not start a forced scan either).

sveinbjornpalsson commented 4 months ago

It scans every hour according to settings. I can't remember if I pressed the scan button in Settings > Libraries. Here's the log, hope hit helps? It's been stable since I started this thread.

crash_logs.txt

mikiher commented 4 months ago

Sorry for nagging again, but the unfortunately crash the logs didn't provide me with much additional info.

Can you please also attach the crashed library scan log? It should be at /logs/scans/2024-02-28_05246d51-dacf-4a88-aafd-9b90c897b4de.txt

Hopefully it will provide me with some more info regarding what happened right before the crash.

Also, since the crash most probably happens during an automated scan, then as a workaround until we figure the root cause, you can disable the automatic scans for that library. This will most likely stop the recurring crashes.

Also, if the folders for this library are local (i.e. not network mounted), then the watcher is capable of detecting new additions to the library, and there's no need for an automated scan at all. However if the library folders are remote, you indeed need the scan, since watchers will likely not detect changes.

geminica commented 4 months ago

I have a similar issue. I moved some epubs to a different directory and now scan crashes. The logs/scan file doesn't show the errors; similar to the OP I get the logged error about failure to extract with "no such file or directory" and then it becomes FATAL immediately after. I haven't finished setting up a dev environment but suspect that on this line, the containerJson can be null and would need to be containerJson? in order to avoid failure?

https://github.com/advplyr/audiobookshelf/blob/0d9d2fa4be9b642f0948e52ddff71ff370fd524d/server/utils/parsers/parseEpubMetadata.js#L73

mikiher commented 4 months ago

Thanks for the info.

I will try to reproduce the way you described.

Regarding containerJson - you're right that it needs to be checked before accessing it, but accessing a null reference doesn't produce an exception like the one we see. However, this means that the code hasn't even reached that point, which means the fatal failure is happening somewhere right before that, but after the ERROR. That narrows it down quite a bit.

mikiher commented 4 months ago

Ah, great, I was able to reproduce this.

The crash happens during a library scan when all the follwoing conditions are met:

  1. An epub is moved from one folder to another within the same library
  2. force_rescan is true (which, after reading the code more carefully, can happen, but only after the metadata precedence of a library was changed, as can be seen in the original crash log).

Due to a bug somewhere, although the book path changes, the code tries to open it in its original location, which likely causes the crash (I haven't identified exactly where the crash happens, but at least I understand the root of the problem).

I will work on a fix.

sveinbjornpalsson commented 4 months ago

Ah, great, I was able to reproduce this.

So you don't need the log? I have found it but I'd be happy to skip it :)

mikiher commented 4 months ago

So you don't need the log? I have found it but I'd be happy to skip it :)

No, thank you, I don't need the log anymore

mikiher commented 4 months ago

@advplyr, I dug into what happens when you move library files into a different folder in the same library.

It looks like the code that handles these kind of changes seems quite buggy at this point, so I'd like to understand better the considerations behind the organization of the data in LibraryItem records and their associated media (Book/Podcast/PodcastEpisode) records, before I implement a fix.

The paths in libraryFiles need to be in sync with the paths in the various media records, otherwise bad things happen.

In this case, inside LibraryScanner.scanLibrary, when a book directory is moved from one folder to another, the scanner cleverly matches the existingLibraryItem with its new location, and then calls checkLibraryItemData, which updates existingLibraryItem and saves it in the database too. However, the associated media records are not updated, which corrupts the databsae and causes subsequent crashes.

So, my questions, before trying to fix:

mikiher commented 4 months ago

Of course, another possibility of fixing this is to remove the clever ino matching code, in which case a folder move would just be identified as a missing book and a new book. I understand you'd rather keep the existing behavior - it's very nice, and takes burden off the user (if it works :)

And in any case, we also need to fix the existing database corruption for users that encountered this bug.

advplyr commented 4 months ago

Thanks for digging into that. The issue with the data model you pointed out is something I've wanted to solve for a while now.

Originally when transitioning to sqlite from the JSON db I mapped out a data model that associated audioFiles, ebooks and images with library files but the queries became too complex while we still need to map everything to the old data model. I'm open to ways to bridge that gap though since changing the data model at this point will be a large project.

When I re-wrote the scanner this last time I must have missed updating the other file paths because that used to be handled.

Removing the inode check would cause a lot of issues since users would lose their media progress when renaming a folder.

mikiher commented 4 months ago

Thanks for digging into that. The issue with the data model you pointed out is something I've wanted to solve for a while now.

Originally when transitioning to sqlite from the JSON db I mapped out a data model that associated audioFiles, ebooks and images with library files but the queries became too complex while we still need to map everything to the old data model. I'm open to ways to bridge that gap though since changing the data model at this point will be a large project.

The approach I was thinking about is the following:

I understand doing this is a serious piece of work, but this is also quite a serious issue. I'm willing to look into this, if you agree. I can also just fix this specific issue locally, which is proably somewhat easier. Please let me know which you prefer.

When I re-wrote the scanner this last time I must have missed updating the other file paths because that used to be handled.

Removing the inode check would cause a lot of issues since users would lose their media progress when renaming a folder.

Yes, I understand that's not an ideal solution. They would also lose any metadata changes they made.

advplyr commented 4 months ago
  • In LibraryItem.libraryFiles, keep only files that are otherwise unmapped to media (I actually am not sure if files unmapped to media need to be kept at all)

Right now if you have multiple cover images you can switch between them in the Cover tab. Library files are used for those other covers. Multiple ebook files are also supported so you can have supplementary material and switch to which is the primary ebook. Library files are used for supplementary ebooks. It would be better if books supported an array of ebooks instead of just one but that is a bigger change we would have to break down.

I like your approach and would prefer that but like you said it is a large project. Excluding media files from the library files array would probably require some refactoring of the mobile apps but I will have to dig into that to see.

advplyr commented 2 months ago

Fixed in v2.9.0