advplyr / audiobookshelf

Self-hosted audiobook and podcast server
https://audiobookshelf.org
GNU General Public License v3.0

[Bug]: JavaScript heap out of memory #3338

Open mpclepto opened 3 weeks ago

mpclepto commented 3 weeks ago

What happened?

ABS is crashing roughly 2-5 times per day with the error stated below.

What did you expect to happen?

No crash, stable uptime

Steps to reproduce the issue

  1. Install ABS
  2. Have library
  3. Wait...

Audiobookshelf version

v2.12.3

How are you running audiobookshelf?

Other (list in "Additional Notes" box)

What OS is your Audiobookshelf server hosted from?

Other (list in "Additional Notes" box)

If the issue is being seen in the UI, what browsers are you seeing the problem on?

None

Logs

<--- Last few GCs --->

[7:0x14e7c84fe010]  3575466 ms: Scavenge 4058.4 (4114.5) -> 4058.2 (4125.3) MB, 6.66 / 0.00 ms  (average mu = 0.458, current mu = 0.311) allocation failure; 
[7:0x14e7c84fe010]  3575521 ms: Scavenge 4065.0 (4125.3) -> 4065.5 (4126.0) MB, 8.98 / 0.00 ms  (average mu = 0.458, current mu = 0.311) allocation failure; 
[7:0x14e7c84fe010]  3576054 ms: Scavenge 4065.9 (4126.0) -> 4065.0 (4148.0) MB, 530.49 / 0.00 ms  (average mu = 0.458, current mu = 0.311) allocation failure; 

<--- JS stacktrace --->

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

Warning: connect.session() MemoryStore is not
designed for a production environment, as it will leak
memory, and will not scale past a single process.
Config /config /metadata
[2024-08-25 10:00:58.027] INFO: === Starting Server ===
[2024-08-25 10:00:58.029] INFO: [Server] Init v2.12.3
[2024-08-25 10:00:58.030] INFO: [Server] Node.js Version: v20.16.0
[2024-08-25 10:00:58.030] INFO: [Server] Platform: linux
[2024-08-25 10:00:58.031] INFO: [Server] Arch: x64

Additional Notes

Unraid 6.12.6; ABS installed from the Unraid approved apps list.

viskanic commented 3 weeks ago

I'm getting the same crash 😞

Docker image version: 2.12.3, Unraid version: 6.12.13

I've checked the location /metadata/logs/crash_logs.txt, but no file was generated.

advplyr commented 3 weeks ago

How big is your library?

mpclepto commented 2 weeks ago

How big is your library?

Items: 283, Authors: 179, Size (GB): 93.1, Audio Tracks: 3,542

Also, ditto what viskanic said: no useful information in /metadata/logs/crash_logs.txt, although the file does exist.

viskanic commented 2 weeks ago

I have two libraries:

Podcasts: Items in Library: 21, Size (GB): 514.9, Audio Tracks: 5,031

Books: Items in Library: 18, Size (GB): 8.3, Audio Tracks: 2,942

I've added --restart unless-stopped to my Docker configuration, so it will restart the container if it crashes.

I've also switched to the debug log level under settings. What I've noticed today is that I'm getting a === Starting Server === message at the beginning of each hour.

Update: The crashes seem to be connected to the cron jobs that are enabled on each podcast to check for updates each hour. I still can't confirm it, but I've updated each podcast to run its cron job only once per day. They are all scheduled to run at the same time, so we'll see what happens.

I'll write back as soon as I know more.
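
For reference, an existing container's restart policy can also be changed in place without recreating it. This is a standard Docker command (the container name audiobookshelf is an assumption; use yours):

    # set the restart policy on an existing container; container name is assumed
    docker update --restart unless-stopped audiobookshelf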

mpclepto commented 2 weeks ago

I have been digging into this more today as well, and I believe mine is also tied to the cron downloads. However, I don't think it is the cron specifically. I have a certain podcast that is set to check/download daily. This podcast hasn't downloaded the latest 6 episodes, so I am downloading them manually; however, ABS crashes constantly while trying to download. I'm doing a single episode at a time and it can't do it; it crashes before finishing each time.

advplyr commented 2 weeks ago

Your library isn't big, so I'm not sure how you could be hitting memory limits. Maybe it is tied to a specific podcast RSS feed? Some podcast RSS feeds are large, and there is no way to request only parts of them.

mpclepto commented 2 weeks ago

Unsure if this is unrelated, but I'm seeing a multitude of partial downloads even when there is a successful download. This happens for all podcast subscriptions. It may be due to ABS crashing while the crons are downloading, until one attempt finally succeeds. For instance, the attached screenshot shows the full episode downloaded at the bottom, with all the failed attempts above it. [Screenshot: ABS_Crash]

advplyr commented 2 weeks ago

How many podcast crons do you have running at the same time? Can you stagger them so that only one runs at a time?

mpclepto commented 2 weeks ago

Not many; at most there were 4 podcasts scanning at the same time. But I just staggered everything, so at worst only 2 podcasts will check at once. I only have 5 podcasts monitored in total, so not many.
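
For anyone else staggering schedules, spreading the per-podcast crons out could look something like this (illustrative cron expressions, not the actual config from this thread):

    0 1 * * *     # podcast A: daily at 01:00
    30 1 * * *    # podcast B: daily at 01:30
    0 2 * * *     # podcast C: daily at 02:00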

viskanic commented 2 weeks ago

Hi everyone πŸ‘‹

Here is a quick update from what I've been able to observe.

I have 21 podcasts in my library, they are all scheduled to check for updates at the same time once per day, but only one is causing the crash.

The Joe Rogan Experience has almost 2,200 episodes, and it breaks ABS somewhere in the region between 2k and 2.2k episodes.

I've removed some of the episodes that I wasn't interested in and everything went back to normal. Now I have 2008 episodes in my library, but I'm guessing it's only a matter of time before this starts happening again.

@advplyr Could you try to pull the data from the same podcast to see if you can reproduce the crash?

Your help is much appreciated πŸ’ͺ

Cheers πŸ˜ƒ

advplyr commented 2 weeks ago

From what you just outlined, I already know what the issue is, and unfortunately it requires large structural changes to the server.

The issue is that when ABS fetches a podcast from the database, it fetches the podcast and every episode. A podcast episode from the database includes the AudioFile object, which contains all of the data we use from that file, including meta tags and chapter markers.

The data model needs to be changed so that the PodcastEpisode table does not store all of the Audio File data. The API needs to be changed to not send every podcast episode when fetching a podcast. Podcast episode requests should be paginated.
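
For illustration only (this is not the actual implementation), a paginated episode query that leaves the heavy audio-file data out could look something like this Sequelize-style sketch; the model and column names here are assumptions:

    // Hypothetical sketch: fetch one page of episodes without the
    // audioFile column (meta tags, chapters, etc.).
    // Model/column names are illustrative, not the real schema.
    async function getEpisodePage(PodcastEpisode, podcastId, page = 0, pageSize = 50) {
      return PodcastEpisode.findAll({
        where: { podcastId },
        attributes: { exclude: ['audioFile'] }, // leave the big blob in the DB
        order: [['publishedAt', 'DESC']],
        limit: pageSize,
        offset: page * pageSize
      })
    }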

There are other areas where we also store/load too much data. The LibraryItem table stores the file metadata (inode, mtime, ctime, etc) for every file that is a part of that library item in the libraryFiles column. That includes cover images, metadata files, audio files, etc.

The groundwork required to solve these problems has been underway over the last year. It will take a good amount of time before the large-podcast issue gets resolved, but I think we're moving in the right direction.

In the meantime, if you have a large library that is running out of memory, I recommend increasing the memory limit until we can improve our data model & API.

The max memory limit is typically 4GB but may vary depending on the system. You can raise it to 6GB by passing an environment variable (documented here):

NODE_OPTIONS=--max-old-space-size=6144

I would consider that a temporary change since we should be able to significantly reduce the memory usage in the future.
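
To confirm the new limit actually took effect, you can print the effective heap limit from inside the container (or wherever Node runs); v8.getHeapStatistics() is a standard Node.js API:

    # prints the effective heap limit in MB for a Node process started
    # with the same environment
    node -e "console.log(require('v8').getHeapStatistics().heap_size_limit / 1024 / 1024, 'MB')"

With NODE_OPTIONS=--max-old-space-size=6144 set, the reported limit should reflect roughly the configured 6144 MB.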

viskanic commented 2 weeks ago

@advplyr Thank you for your detailed and prompt reply.

I will update the environment variable to prevent the crash :+1:

mpclepto commented 2 weeks ago

Interesting. I'm glad the other poster mentioned Joe Rogan, as I have this podcast as well and noticed the most issues with it. However, I do have even larger podcasts that did not have problems. Regardless, I will also update this variable and monitor. Thanks!

Deekerman commented 2 weeks ago

Can we make it any size? 8 or 12 GB?

viskanic commented 2 weeks ago

@Deekerman I've set it to --max-old-space-size=8192 and it worked as expected πŸ‘ You can set it as high as you want as long as you have free RAM on your machine.

Deekerman commented 2 weeks ago

@viskanic Can you tell me where I find the damn file? lol

Mine is in PVE, so I can just add more RAM as needed.

viskanic commented 2 weeks ago

@Deekerman There is no file.

You need to update your Docker config to include the environment variable.

Key: NODE_OPTIONS, Value: --max-old-space-size=8192

In Unraid it looks like this:

[Screenshot: Unraid Docker template with the NODE_OPTIONS environment variable set]
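
For a plain Docker setup outside Unraid, the same variable can be passed on the command line. Here is a sketch based on the standard install command, with placeholder paths:

    docker run -d \
      -e NODE_OPTIONS=--max-old-space-size=8192 \
      --restart unless-stopped \
      -p 13378:80 \
      -v </path/to/config>:/config \
      -v </path/to/metadata>:/metadata \
      -v </path/to/audiobooks>:/audiobooks \
      ghcr.io/advplyr/audiobookshelf:latest
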
Deekerman commented 2 weeks ago

Oh, I did it this way, but I found the file with this link lol:

https://www.audiobookshelf.org/docs#linux-install-deb

@viskanic But that still doesn't help lol.

This right?

[Screenshot: attempted config edit]

viskanic commented 2 weeks ago

@Deekerman

You didn't specify the key:value pair correctly.

Update the config with the following line: NODE_OPTIONS=--max-old-space-size=8192
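
For a deb install, that line goes in the service's environment/config file described in the linked docs. As a sketch (the path /etc/default/audiobookshelf is a Debian-convention assumption, not confirmed in this thread):

    # add to the audiobookshelf environment file, then restart the service,
    # e.g. sudo systemctl restart audiobookshelf
    NODE_OPTIONS=--max-old-space-size=8192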

Deekerman commented 2 weeks ago

I just realized that. It didn't seem to fix anything, though; JRE still drops off.



INFO: === Starting Server ===
[2024-09-02 15:43:47.334] INFO: [Server] Init v2.13.2
[2024-09-02 15:43:47.335] INFO: [Server] Node.js Version: v20.11.1
[2024-09-02 15:43:47.335] INFO: [Server] Platform: linux
[2024-09-02 15:43:47.335] INFO: [Server] Arch: x64
[2024-09-02 15:43:47.358] INFO: [BinaryManager] Found valid binary ffmpeg at /usr/share/audiobookshelf/ffmpeg
[2024-09-02 15:43:47.358] INFO: [BinaryManager] Updating process.env.FFMPEG_PATH
[2024-09-02 15:43:47.375] INFO: [BinaryManager] Found valid binary ffprobe at /usr/share/audiobookshelf/ffprobe
[2024-09-02 15:43:47.375] INFO: [BinaryManager] Updating process.env.FFPROBE_PATH
[2024-09-02 15:43:47.376] INFO: [Database] Initializing db at "/usr/share/audiobookshelf/config/absdatabase.sqlite"
[2024-09-02 15:43:47.406] INFO: [Database] Db connection was successful
[2024-09-02 15:43:47.550] INFO: [Database] Db initialized with models: user, library, libraryFolder, book, podcast, podcastEpisode, libraryItem, mediaProgress, series, bookSeries, author, bookAuthor, collection, collectionBook, playlist, playlistMediaItem, device, playbackSession, feed, feedEpisode, setting, customMetadataProvider, mediaItemShare
[2024-09-02 15:43:47.553] DEBUG: Set Log Level to DEBUG
[2024-09-02 15:43:47.646] DEBUG: Daily Log file found 2024-08-27.txt
[2024-09-02 15:43:47.646] DEBUG: Daily Log file found 2024-08-28.txt
[2024-09-02 15:43:47.646] DEBUG: Daily Log file found 2024-08-29.txt
[2024-09-02 15:43:47.646] DEBUG: Daily Log file found 2024-08-30.txt
[2024-09-02 15:43:47.647] DEBUG: Daily Log file found 2024-08-31.txt
[2024-09-02 15:43:47.647] DEBUG: Daily Log file found 2024-09-01.txt
[2024-09-02 15:43:47.647] DEBUG: Daily Log file found 2024-09-02.txt
[2024-09-02 15:43:47.647] INFO: [LogManager] Init current daily log filename: 2024-09-02.txt
[2024-09-02 15:43:47.703] DEBUG: [BackupManager] Backup found "2024-09-01T0130"
[2024-09-02 15:43:47.713] DEBUG: [BackupManager] Backup found "2024-09-02T0130"
[2024-09-02 15:43:47.713] INFO: [BackupManager] 2 Backups Found
[2024-09-02 15:43:47.723] DEBUG: [CronManager] Init library scan cron for Ebooks on schedule 0 0 * * 1
[2024-09-02 15:43:47.783] DEBUG: [CronManager] Found 2 podcast episode schedules to start
[2024-09-02 15:43:47.783] DEBUG: [CronManager] Scheduling podcast episode check cron "0 * * * *" for 15 item(s)
[2024-09-02 15:43:47.784] DEBUG: [CronManager] Scheduling podcast episode check cron "0 1 * * *" for 1 item(s)
[2024-09-02 15:43:47.784] INFO: [Watcher] Initializing watcher for "Audiobooks".
[2024-09-02 15:43:47.785] DEBUG: [Watcher] Init watcher for library folder path "/mnt/media/audiobooks"
[2024-09-02 15:43:47.785] INFO: [Watcher] Initializing watcher for "Ebooks".
[2024-09-02 15:43:47.785] DEBUG: [Watcher] Init watcher for library folder path "/mnt/media/ebook"
[2024-09-02 15:43:47.785] INFO: [Watcher] Initializing watcher for "Comics".
[2024-09-02 15:43:47.786] DEBUG: [Watcher] Init watcher for library folder path "/mnt/media/comics"
[2024-09-02 15:43:47.793] INFO: Listening on http://0.0.0.0:13378
[2024-09-02 15:43:48.289] INFO: [Watcher] "Comics" Ready
[2024-09-02 15:43:48.613] INFO: [Watcher] "Ebooks" Ready
[2024-09-02 15:43:48.813] INFO: [Watcher] "Audiobooks" Ready
[2024-09-02 15:43:51.216] DEBUG: [ApiCacheManager] count: 0 size: 0
[2024-09-02 15:43:51.234] DEBUG: Loaded filterdata in 0.00s
[2024-09-02 15:43:51.236] DEBUG: [ApiCacheManager] Cache miss: {"user":"root","url":"/libraries/dc8d650d-2894-41d9-8f2d-299bf7a9d7ef?include=filterdata"}
[2024-09-02 15:43:51.498] DEBUG: [ApiCacheManager] count: 1 size: 1340
[2024-09-02 15:43:51.511] DEBUG: [ApiCacheManager] Cache miss: {"user":"root","url":"/libraries"}
[2024-09-02 15:43:51.551] INFO: [SocketAuthority] Socket Connected WTprpy8hvcxkyExTAAAB
[2024-09-02 15:43:51.564] DEBUG: [SocketAuthority] User Online root
mpclepto commented 2 weeks ago

Just checking in to confirm that I have had no crashes in the last day, and that was from bumping the space to 6 GB. Now, I did stagger my crons the day prior, so 6 GB may be enough on its own. However, I have bumped it to 8 GB in hopes that I never see the issue again.

I did so via the Unraid Docker options, as in @viskanic's screenshot.

mpclepto commented 1 week ago

Just updating this thread to report 3 more days of issue-free running. This appears to have entirely solved my problem: from 3-5 crashes a day without fail to zero over the last several days. Thanks for the help, all!