RemiRigal / Plex-Auto-Languages

Automated language selection for Plex TV Shows !
MIT License
428 stars 16 forks source link

Initial scan taking a long time... can we have a cache file? #35

Closed BEisem closed 2 years ago

BEisem commented 2 years ago

The newest version of Plex-Auto-Languages does an initial scan of all episodes when it launches. According to the log it should take "a few seconds." I have close to 400k episodes in my library, and this scan takes upwards of 10 minutes. Would it be possible to cache some of this information so that a full scan isn't required every time it launches?

RemiRigal commented 2 years ago

I was happy when it took less than 30 seconds with my 20k episodes library but damn, 400k episodes... I get how this is an issue for you ! You probably want to rollback to version v1.0.15 while I fix this though, the current version scans the library whenever you update episodes.

BEisem commented 2 years ago

Yeah I know, I realize I'm kind of an extreme case. But I know there are lots of other people out there with massive libraries, so it would be great to get this working for those edge cases as well.

The good news is, as far as I can tell, once the initial scan is complete, it seems to be working ok. I guess it only scans the entire library whenever an episode is updated, but not when a new episode is added?

I'm running on docker-compose, how would I roll back? Would I change the tag to :v1.0.15?

RemiRigal commented 2 years ago

A full scan is not necessary to get the list of recently added episodes, only for updated ones. But since a full scan is the only way to detect updated episodes, PlexAutoLanguages does one whenever it detects a library update on Plex's side. I will find a way to tackle this.

Yes ! If you want to rollback, use the docker image remirigal/plex-auto-languages:v1.0.15.

BEisem commented 2 years ago

Hi @RemiRigal, I hate to sound impatient, but have you had any progress with this? I've rolled back to v1.015, but I see you have been making progress with the notifications in new versions, and would love to be able to utilize those. But I can't upgrade until this scanning time is reduced somehow. Thanks so much for your hard work!

RemiRigal commented 2 years ago

No problem I understand, I'm working on it right now, I should be able to release this in the next few hours !

BEisem commented 2 years ago

Awesome, thanks so much! Can't wait to try it out.

RemiRigal commented 2 years ago

PR #38 adds a cache file for PAL. Make sure you have the correct configuration for it to work:

I hope that it will be smoother for you, let me know ! The changes are available from version v1.1.2.

BEisem commented 2 years ago

Thanks! I have it updated and running, I'll let you know how I get on with it. Awesome work!

BEisem commented 2 years ago

Btw just a minor logging issue, you might want to change this wording: Scanning all episodes from the library, this action can take a few seconds since it clearly takes a lot longer for larger libraries. You might want to say something like Scanning all episodes from the library, this action can take a few seconds for small libraries, or a lot longer for larger libraries.

RemiRigal commented 2 years ago

Sure, thanks for pointing this out !

RemiRigal commented 2 years ago

@BEisem can I close this ? Did you have any issues with the cache feature ?

BEisem commented 2 years ago

Sorry, yes it is working! I'll close this. Thank you!

BEisem commented 2 years ago

I'm sorry but I'm afraid I have to re-open this. I'm not quite sure what is going on, but after PAL has been running overnight, it no longer catches and changes new shows I watch. For example...

PAL has been running for at least a day or more, with the cache in place. I started watching a show this morning, and the captions were not correctly set, so I had to manually adjust it. This should have triggered an entry in the logs for my user, indicating that it was changing all the episodes for this show. But the log entry never showed up. So eventually I restarted PAL, and once I restarted it, the log entry appeared.

I should also mention that during this time, while watching the logs, new episodes are being added and I do see log entries for "Processing newly added episode..." So PAL definitely is still working, it just doesn't seem to work for episodes that are changed by users. It only works for new episodes being added.

I'm not sure what other information I can provide, as this doesn't really seem to be a very specific set of instructions on how to get this to happen, but please let me know if I can provide anything else. Thanks!

RemiRigal commented 2 years ago

No need to apologize, your feedback is necessary to improve PAL 👍

What version of PAL were you running ?

I should also mention that during this time, while watching the logs, new episodes are being added and I do see log entries for "Processing newly added episode..." So PAL definitely is still working, it just doesn't seem to work for episodes that are changed by users. It only works for new episodes being added.

Can you tell me if the newly added episodes were processed right after a Plex scan or during the scheduler task ?

BEisem commented 2 years ago

What version of PAL were you running ?

I'm not sure how to tell the version ... I'm running the last version released, if that helps.

Can you tell me if the newly added episodes were processed right after a Plex scan or during the scheduler task ?

The newly added episodes are being processed right after a Plex scan. I'm sure there was also a scheduler task that runs overnight, but I didn't watch them happen in real time.

I should also point out, I don't know if this has anything to do with it, but I use a really useful tool called autoscan (https://github.com/Cloudbox/autoscan). This allows partial Plex scans, so when I add new content, only the affected directories get scanned, rather than doing a full library scan every time.

BEisem commented 2 years ago

Just so you know this is still happening ... seems like something happens overnight to cause any new episodes I watched to not have their subtitles turned on, even though the rest of the show already had them turned on. I wonder if something might be happening when the scheduled run happens at 04:30? I'm going to try turning off the schedule in the yaml file, and see if the same thing happens tomorrow morning.

BEisem commented 2 years ago

Just to be clear, here are the exact steps I did (just for future reference, so I'll remember)

  1. Stopped the PAS container docker stop PlexAutoLanguages
  2. Edited yaml file to enable: false under the scheduler: section
  3. Deleted the existing cache file (just to get a clean slate)
  4. Restarted PAS container.
BEisem commented 2 years ago

Ok so, 24 hours later and I have some new things to report.

Turning off the scheduler did not help. The problem is still occurring.

I realized maybe I didn't explain exactly what I'm seeing, so I'll try to give some more details.

It seems like after PAS has been running for some time (for example 24 hours), it stops reacting to new shows being played within a series that already had subtitles turned on. So here is a list of steps to see what I am seeing.

While watching the logs, even after it has been running overnight, I can see when new episodes are added, they get log entries like the following:

2022-05-19 09:06:56,127 [INFO] [Timeline] Processing newly added episode 'Below Deck Down Under' (S01E12)
2022-05-19 09:09:16,070 [INFO] Language update: Episode: Below Deck Down Under (S01E12) | Status: New episode | Updated for all users
2022-05-19 10:01:29,600 [INFO] [Timeline] Processing newly added episode 'Married at First Sight' (S14E18)
2022-05-19 10:04:43,679 [INFO] Language update: Episode: Married at First Sight (S14E18) | Status: New episode | Updated for all users
2022-05-19 10:04:45,378 [INFO] [Timeline] Processing newly added episode 'Derry Girls' (S03E07)
2022-05-19 10:07:14,888 [INFO] [Status] Processing newly added episode 'MasterChef Australia' (S14E24)
2022-05-19 10:10:45,429 [INFO] Language update: Episode: MasterChef Australia (S14E24) | Status: New episode | Updated for all users
2022-05-19 10:22:34,901 [INFO] [Timeline] Processing newly added episode 'Location, Location, Location' (S11E01)
2022-05-19 10:24:44,845 [INFO] [Timeline] Processing newly added episode 'Location, Location, Location' (S11E02)
2022-05-19 10:26:57,538 [INFO] [Timeline] Processing newly added episode 'Location, Location, Location' (S11E03)
2022-05-19 10:29:17,416 [INFO] [Status] Processing newly added episode 'Location, Location, Location' (S19E01)
2022-05-19 10:31:42,958 [INFO] [Status] Processing newly added episode 'Location, Location, Location' (S19E02)
2022-05-19 10:34:02,049 [INFO] [Status] Processing newly added episode 'Location, Location, Location' (S17E02)

So, using the show "Below Deck Down Under as an example, we watched an episode of that show this morning. Even though the other episodes of this show had previously been set up for English subtitles, this episode had not been changed by PAS. So I manually changed it to English subtitles, and started watching the show. I think I should have seen a log entry, saying that PAS changed any other episodes, but no entry appeared. Then, while watching the show, I restarted the PAS container. Immediately after restarting, the log entry appeared:

2022-05-19 10:34:25,438 [INFO] Received SIGINT or SIGTERM, stopping gracefully
2022-05-19 10:34:31,347 [INFO] Parsing config file '/config/config.yaml'
2022-05-19 10:34:31,350 [INFO] The provided configuration has been successfully validated
2022-05-19 10:34:31,675 [INFO] Successfully connected as user 'Departed69' (id: 1)
2022-05-19 10:34:32,093 [INFO] Starting alert listener
2022-05-19 10:34:33,349 [INFO] Language update: Show: Below Deck Down Under | User: Departed69 | Audio: English (EAC3 5.1) | Subtitles: English (Closed Captions) | Updated episodes: 1/12 (S01E01 - S01E12)

So as I said before, it seems that after running for a certain amount of time, it stops responding to new shows being watched. But restarting the container fixes it (temporarily).

RemiRigal commented 2 years ago

Thank you for all the information, it is extremely helpful !

It seems that you are having issues with the websocket connection between PAL and Plex. Can you give me additional information on your network setup ? Are you using a proxy between PAL and Plex or a direct connection ?

Can you try the image remirigal/plex-auto-languages:master ? I've made improvements regarding the management of the websocket connection, it might help finding the source of the problem. And if you can enable the debug mode it would be great, if it's a concern to you it will not generate an unreasonable amount of logs.

BEisem commented 2 years ago

Hi, thanks for the reply. I am trying the :master image, and I have turned on debug mode.

About my network setup... PAL is running in Docker, and Plex is running natively. In the docker-compose, I am using network-mode: host. In fact, here is the section of my docker-compose for PAL, in case anything is out of place:

# Plex Auto Languages
  plexautolanguages:
    image: remirigal/plex-auto-languages:master
    container_name: PlexAutoLanguages
    environment:
      - TZ=$TZ
    volumes:
      - $DOCKERDIR/plexautolanguages/config:/config
    network_mode: host
    restart: unless-stopped

$TZ and $DOCKERDIR are of course env variables.

And in the PAL config file, I'm using url: "http://localhost:32400" for Plex.

BEisem commented 2 years ago

Update after running this new version for 24 hours and having debug logs on...

The same thing happened this morning. According to the logs, it seemed to be relatively idle. I started playing something at around 9:36 or 9:37, changed the subtitles to English, but PAL did not respond with anything in the logs. So at that point, I restarted PAL, and afterwards, it immediately responded. I'm attaching the logs around that time so you can see what it was doing.

2022-05-21 09:01:15,635 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:01:16,406 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:01:17,920 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:01:18,691 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:01:19,382 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:03:22,249 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:07:40,914 [DEBUG] [Status] The Plex server scanned the library
2022-05-21 09:38:51,266 [INFO] Received SIGINT or SIGTERM, stopping gracefully
2022-05-21 09:38:51,376 [DEBUG] [Cache] Saving server cache to file
2022-05-21 09:38:53,598 [INFO] Parsing config file '/config/config.yaml'
2022-05-21 09:38:53,602 [INFO] The provided configuration has been successfully validated
2022-05-21 09:38:53,602 [DEBUG] Debug mode enabled
2022-05-21 09:38:53,874 [INFO] Successfully connected as user 'Departed69' (id: 1)
2022-05-21 09:38:53,874 [DEBUG] [Cache] Loading server cache from file
2022-05-21 09:38:54,242 [INFO] Starting alert listener
2022-05-21 09:39:03,250 [DEBUG] [Play Session] Session: 1122 | State: 'buffering' | User id: 1 | Episode: <Episode:658711:Welcome-to-Plathvill-s04e01>
2022-05-21 09:39:03,330 [DEBUG] [Language Update] Checking language update for show <Show:121643:Welcome-to-Plathvill> and user 'Departed69' based on episode <Episode:658711:Welcome-to-Plathvill-s04e01>
2022-05-21 09:39:03,706 [DEBUG] [Language Update] Performing 25 change(s) for show <Show:121643:Welcome-to-Plathvill>
2022-05-21 09:39:03,706 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121667:Welcome-to-Plathvill-s01e02> to <SubtitleStream:6545160:SDH>
2022-05-21 09:39:03,710 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121670:Welcome-to-Plathvill-s01e03> to <SubtitleStream:6545163:SDH>
2022-05-21 09:39:03,713 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121671:Welcome-to-Plathvill-s01e04> to <SubtitleStream:6545166:SDH>
2022-05-21 09:39:03,717 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121672:Welcome-to-Plathvill-s01e05> to <SubtitleStream:6545169:SDH>
2022-05-21 09:39:03,721 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121673:Welcome-to-Plathvill-s01e06> to <SubtitleStream:6545172:SDH>
2022-05-21 09:39:03,726 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121656:Welcome-to-Plathvill-s02e01> to <SubtitleStream:6545131:SDH>
2022-05-21 09:39:03,731 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121657:Welcome-to-Plathvill-s02e02> to <SubtitleStream:6545134:SDH>
2022-05-21 09:39:03,734 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121658:Welcome-to-Plathvill-s02e03> to <SubtitleStream:6545137:SDH>
2022-05-21 09:39:03,739 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121659:Welcome-to-Plathvill-s02e04> to <SubtitleStream:6545140:SDH>
2022-05-21 09:39:03,744 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121660:Welcome-to-Plathvill-s02e05> to <SubtitleStream:6545143:SDH>
2022-05-21 09:39:03,749 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121661:Welcome-to-Plathvill-s02e06> to <SubtitleStream:6545146:SDH>
2022-05-21 09:39:03,753 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121662:Welcome-to-Plathvill-s02e07> to <SubtitleStream:6545149:SDH>
2022-05-21 09:39:03,758 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121663:Welcome-to-Plathvill-s02e08> to <SubtitleStream:6545152:SDH>
2022-05-21 09:39:03,763 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121664:Welcome-to-Plathvill-s02e09> to <SubtitleStream:6545155:SDH>
2022-05-21 09:39:03,768 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121645:Welcome-to-Plathvill-s03e01> to <SubtitleStream:6545095>
2022-05-21 09:39:03,775 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121646:Welcome-to-Plathvill-s03e02> to <SubtitleStream:6545098>
2022-05-21 09:39:03,783 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121648:Welcome-to-Plathvill-s03e04> to <SubtitleStream:6545104>
2022-05-21 09:39:03,789 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121649:Welcome-to-Plathvill-s03e05> to <SubtitleStream:6545107>
2022-05-21 09:39:03,792 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121650:Welcome-to-Plathvill-s03e06> to <SubtitleStream:6545110>
2022-05-21 09:39:03,796 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121651:Welcome-to-Plathvill-s03e07> to <SubtitleStream:6545113>
2022-05-21 09:39:03,801 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121652:Welcome-to-Plathvill-s03e08> to <SubtitleStream:6545116>
2022-05-21 09:39:03,805 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121653:Welcome-to-Plathvill-s03e09> to <SubtitleStream:6545119>
2022-05-21 09:39:03,811 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:121654:Welcome-to-Plathvill-s03e10> to <SubtitleStream:6545122>
2022-05-21 09:39:03,815 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:448929:Welcome-to-Plathvill-s03e11> to <SubtitleStream:6545125>
2022-05-21 09:39:03,820 [DEBUG] [Language Update] Updating subtitle stream of episode <Episode:450883:Welcome-to-Plathvill-s03e12> to <SubtitleStream:6545128>
2022-05-21 09:39:03,824 [INFO] Language update: Show: Welcome to Plathville | User: Departed69 | Audio: English (AC3 5.1) | Subtitles: English (SRT) | Updated episodes: 25/28 (S01E01 - S04E01)

I guess I could enter a cron to restart PAL every 6 hours or so, which would be a bandaid of sorts. It would fix my problem, but wouldn't fix the underlying issue, whatever it might be.

RemiRigal commented 2 years ago

What surprises me is that PAL was able to catch Plex scans at least until 9:07, which means that the websocket connection was working at that time. Unless an issue happened between 9:07 and 9:36 when you started watching your show, the websocket might not be the source of the problem. Just to be sure, can you try to perform a Plex scan (with autoscan eventually) when you notice that the play sessions are no longer being detected by PAL. Do you still have the log [Status] The Plex server scanned the library then ?

It will probably work with a cron yes, but I would rather fix this issue ! We just need to find what's causing it.

BEisem commented 2 years ago

Ok so the next time I see play sessions are no longer being detected, I will run a Plex scan. So just to be clear, every time Plex runs a scan of any kind, I should see [Status] The Plex server scanned the library in the logs. Is that correct?

RemiRigal commented 2 years ago

Absolutely, you are correct.

BEisem commented 2 years ago

Ok so, some new information... I am still seeing several Plex scan items in the log, and yet it is not registering my Play or Pause actions.

But here's something that is very weird... it seems like it might be working for other users on my server.

2022-05-24 09:40:59,074 [DEBUG] [Status] The Plex server scanned the library
2022-05-24 09:41:15,488 [DEBUG] [Status] The Plex server scanned the library
2022-05-24 09:41:34,983 [DEBUG] [Status] The Plex server scanned the library
2022-05-24 09:41:36,733 [DEBUG] [Play Session] Session: 181 | State: 'paused' | User id: 3647819 | Episode: <Episode:401691:The-Flash-(2014)-s06e13>
2022-05-24 09:41:37,392 [DEBUG] [Language Update] Checking language update for show <Show:401658:The-Flash-(2014)> and user 'redacted' based on episode <Episode:401691:The-Flash-(2014)-s06e13>
2022-05-24 09:41:41,062 [DEBUG] [Language Update] No changes to perform for show <Show:401658:The-Flash-(2014)> and user 'redacted'
2022-05-24 09:41:44,438 [DEBUG] [Play Session] Session: 181 | State: 'playing' | User id: 3647819 | Episode: <Episode:401691:The-Flash-(2014)-s06e13>
2022-05-24 09:42:16,423 [DEBUG] [Status] The Plex server scanned the library
2022-05-24 09:42:39,790 [DEBUG] [Status] The Plex server scanned the library
2022-05-24 09:43:05,669 [DEBUG] [Status] The Plex server scanned the library
2022-05-24 09:45:15,832 [DEBUG] [Status] The Plex server scanned the library

This is the latest set of log entries. I tried playing something at 9:42-9:43. I changed the captioning, started playing, played and paused several times, and none of them registered. Then at 9:45, another Plex scan happened, and that did register.

This time, I did not restart PAL yet, because I wanted to see if you want me to do anything else or run any other tests while it is in this state. I'll also keep an eye on it to see if other user's play or stop actions will create a new log entry.

BEisem commented 2 years ago

Ok yeah, so this is very interesting ... it only seems to be affecting my main user account and my wife's managed account, both in the Home group. But other accounts that are shared accounts (not in the Home) don't seem to be affected.

More log entries have come in from other people playing, and it is registering those. But as I continue to try and play from the main account or my wife's managed account, log entries do not appear.

2022-05-24 09:52:59,626 [DEBUG] [Play Session] Session: 180 | State: 'stopped' | User id: 53013128 | Episode: <Episode:661660:Better-Call-Saul-s06e07>
2022-05-24 09:52:59,626 [DEBUG] [Play Session] End of session 180 for user 53013128
2022-05-24 09:52:59,770 [DEBUG] [Language Update] Checking language update for show <Show:267025:Better-Call-Saul> and user 'redacted' based on episode <Episode:661660:Better-Call-Saul-s06e07>
2022-05-24 09:53:00,935 [DEBUG] [Language Update] No changes to perform for show <Show:267025:Better-Call-Saul> and user 'redacted'
RemiRigal commented 2 years ago

Good news ! I think I have identified the source of the problem, which is not related to the websocket at all... There are situations where the user could not be identified during a play session because the request was done too soon. Latest commit should fix the issue.

Can you try again the remirigal/plex-auto-languages:master image and let me know ?

BEisem commented 2 years ago

Can you try again the remirigal/plex-auto-languages:master image and let me know ?

Great! I just updated, and I'll let you know what happens. Thanks for the fast fixes!

BEisem commented 2 years ago

I do believe this is working! I have been hesitant to close this issue, because it still seems like every once in awhile, my play sessions are still not recognized. But it is much more rare than before, and I can't quite determine when or how it happens.

One thing I've noticed is, since I have a large number of shared users, whenever a new episode is added, it takes quite some time for it to check and update the language for every user and every episode. If I add an entire TV show, this can really back things up quite a bit. With this in mind, it's possible I was seeing this as my play sessions not being recognized, when in fact it was just a large queue of new episodes that had to be cleared before it got to my play session.

Questions... is it possible to change the language for an entire show at a time, rather than episode by episode? And secondly, is it possible to do these things in parallel? For example maybe use one thread for updates with new episodes, and a separate thread to handle play/stop sessions? This way, the two can be happening simultaneously, rather than having a queue that has to get through first.

BEisem commented 2 years ago

I'm still having the issue with PAL sometimes not recognizing when I start playing something. The logs do not show the play command, and therefore it doesn't register the changed language. Restarting PAL causes it to be recognized. Any ideas why this could be happening?

RemiRigal commented 2 years ago

Hi @BEisem, sorry for the huge delay. Your previous analysis was on point, you have a large number of users so you are experiencing this issue because the huge amount of requests to the Plex server are blocking the thread of the websocket.

This is definitely something that needs to be fixed, especially for power users like you ! If you are ok we can close this issue and proceed on #51 to keep things organized.

Thank you again for being a beta tester, your feedbacks are extremely valuable.

BEisem commented 2 years ago

Sounds good, I'll go ahead and close it, assuming my issue is related to the plex.tv connection. Once we get that issue resolved, if I am still experiencing this, I'll reopen it.