se1exin / Cleanarr

A simple UI to help find and delete duplicate and sample files from your Plex server
https://hub.docker.com/r/selexin/cleanarr
MIT License

Unable to start "Failed to load content!" #55

Closed: austempest closed this issue 1 year ago

austempest commented 2 years ago

I've just set this up for the first time and cannot get it to work for me. Also can't find a discord to ask for support.

services:
  cleanarr:
    image: selexin/cleanarr:latest
    container_name: cleanarr
    hostname: cleanarr
    ports:

The container is within an LXC (ubuntu) on a proxmox node on IP 192.168.1.92

log files https://pastebin.com/2HLQZ9g9

se1exin commented 2 years ago

@austempest thanks for reporting, and for the log files!

Looks like the backend is timing out, which might be related to having many library names configured (I've only tested with 2 libraries - not as many as 6 sorry). Can you please try adding the following environment variable to your docker-compose?

PAGE_SIZE=20

As of a few days ago, Cleanarr queries Plex in pages/chunks - PAGE_SIZE determines how many media items to load per chunk. The default is 50 - so 6 x 50 = 300, which might be causing the backend process to run out of memory.

Let me know if lowering it to 20 helps at all (or even try as low as 10, 5, or 1).
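The arithmetic above can be sketched as follows. This is a minimal illustration, not Cleanarr's actual code: `page_size_from_env` and `items_in_flight` are hypothetical helpers, and only the `PAGE_SIZE` variable name and its default of 50 come from the comment above.

```python
import os


def page_size_from_env(default: int = 50) -> int:
    """Read PAGE_SIZE from the environment, falling back to the default."""
    raw = os.environ.get("PAGE_SIZE", "")
    try:
        value = int(raw)
    except ValueError:
        # Unset or non-numeric values fall back to the default.
        return default
    # Hypothetical guard: treat zero or negative sizes as "use the default".
    return value if value > 0 else default


def items_in_flight(library_count: int, page_size: int) -> int:
    """Upper bound on items loaded at once if every library loads one page."""
    return library_count * page_size


if __name__ == "__main__":
    size = page_size_from_env()
    print(f"6 libraries x PAGE_SIZE={size} -> up to {items_in_flight(6, size)} items")
```

With six libraries, lowering the page size from 50 to 20 would drop the upper bound from 300 items to 120 under this simple model.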

austempest commented 2 years ago

Thanks for the quick reply @se1exin.

Added the variable to docker compose, no change (I added PAGE_SIZE=10), log here

Changed the env to [PAGE_SIZE=1] log here

Changed to one library [LIBRARY_NAMES=Movies]; [PAGE_SIZE=10] log here

Other pieces of info

se1exin commented 2 years ago

@austempest Thanks for trying the PAGE_SIZE changes out - looks like it's not related to that.

It could be a plex token issue, but it sounds like you are generating it correctly. Are you wrapping the token string in quotes in your docker-compose file? If it contains non-alphanumeric characters there may be some unintentional escaping or similar happening. E.g:

(BAD)
environment:
  - PLEX_TOKEN=myplextoken

(GOOD)
environment:
  - PLEX_TOKEN="myplextoken"
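Whether the quotes help may depend on how the compose tool parses list-style `environment` entries: some setups may pass the quote characters through as part of the value. One way to rule this out is to inspect the value as the process actually sees it. The sketch below is only an illustration under that assumption; `strip_wrapping_quotes` is a hypothetical helper, not part of Cleanarr.

```python
import os


def strip_wrapping_quotes(value: str) -> str:
    """Remove one pair of matching quotes around a value, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in ("'", '"'):
        return value[1:-1]
    return value


if __name__ == "__main__":
    raw = os.environ.get("PLEX_TOKEN", "")
    cleaned = strip_wrapping_quotes(raw)
    # Report only lengths so the token itself never lands in a log.
    print(f"raw length={len(raw)}, cleaned length={len(cleaned)}")
```

If the two lengths differ when run inside the container, the value reaching the process includes literal quotes or surrounding whitespace.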

Also, seeing your computer's IP in the webserver logs is completely normal: it's the IP address of the computer that made the network request to the server.

austempest commented 2 years ago

@se1exin ... nope :( it's going to be something really silly

my plex token had a couple of underscores and ended in one, so i wrapped it in inverted commas, still no dice

version: '3'
services:
  cleanarr:
    image: selexin/cleanarr:latest
    container_name: cleanarr
    hostname: cleanarr
    ports:
      - 8885:80
    environment:
      - BYPASS_SSL_VERIFY=1
      - PLEX_TOKEN="h_svthemiddlepartofmytokenkrU_"
      - PLEX_BASE_URL=http://192.169.1.93:32400
      - PAGE_SIZE=10
#      - LIBRARY_NAMES=Movies;Movies Kids;TV Shows;TV Shows Kids; TV Shows Older Kids;TV Shows Solo
      - LIBRARY_NAMES=Movies
    volumes:
      - /home/cleanarr:/config
    restart: unless-stopped

se1exin commented 2 years ago

Hmmm, now I'm starting to scratch my head... You mentioned that the container is running in LXC and proxmox - that looks like an alternative setup to using docker(?). Perhaps the networking model is different and it can't reach your plex server? If LXC has a host-networking model similar to docker, I would try that - you will need to hit the server at port 80 instead of 8885, but it might be a good starting point to identify whether the issue is related to how networking is handled with your container stack.

P.s. I'm sure you're going to, but just in case you should revoke that plex token now that it has been posted publicly (but thanks for showing that you certainly do have it quoted properly) :)

austempest commented 2 years ago

I thought about that, but i doubt it.

The LXC is essentially a cut down VM, and the LXC has docker installed and accessible via portainer or ssh. I use portainer to create a cleanarr stack, and edit the docker-compose from there. The exact same LXC/docker/portainer interface also has radarr and sonarr installed, and both can talk to plex.

I ssh'd into the container and tried to ping the internal IP plex is on (another container in proxmox)

root@docker-servarr:~# ping 192.168.1.93
PING 192.168.1.93 (192.168.1.93) 56(84) bytes of data.
64 bytes from 192.168.1.93: icmp_seq=1 ttl=64 time=0.141 ms
64 bytes from 192.168.1.93: icmp_seq=2 ttl=64 time=0.086 ms

And thanks for the concern, but the middle part of the plex token i deleted and replaced with 'themiddlepartofmytoken', lol
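A successful ping only shows that the host answers ICMP; it does not show that anything is accepting connections on the Plex port. A rough TCP-level check could look like the sketch below. `can_connect` is a hypothetical helper, and any host and port passed to it (for example the address pinged above and the port from `PLEX_BASE_URL`) are example values only.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `can_connect("192.168.1.93", 32400)` from inside the Cleanarr container would distinguish "host is up" from "port is reachable", though it still says nothing about whether the token is accepted.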

se1exin commented 2 years ago

> And thanks for the concern, but the middle part of the plex token i deleted and replaced with 'themiddlepartofmytoken', lol

Haha guess I just saw the random characters and assumed it was a legit token.

> The exact same LXC/docker/portainer interface also has radarr and sonarr installed, and both can talk to plex.

Right, ok so definitely not networking either. Let me put together a build with some more in-depth error logging, hopefully with that we can get some more info on what is going on. And thanks for your help so far in troubleshooting 👍🏼

austempest commented 2 years ago

Thanks.

There is one docker container i run that refuses to work in docker in a LXC (transmission / openvpn), and has to run in a docker instance installed on a full VM. Thought i'd give that a shot and spun up a new cleanarr instance on that full VM with docker. Still no luck, the exact same GUI error and here's the logs if it helps

peter-mcconnell commented 1 year ago

Same situation for me (tried it for the first time today, getting these issues). I had a quick look at what was happening and noticed ./content/dupes was throwing a 504, so I isolated PlexWrapper().get_dupe_content() for testing and found that it takes real 1m6.430s. My first thought is that this may need to be split so that the client can make N calls, rather than relying on python to do it (otherwise this method has to guarantee a total execution time within a timeout window, whilst parsing the entirety of varying dataset sizes). It'll take the same time to process (or slightly longer) but timeouts should be reliably avoided and the client experience can be enriched - I believe this is the only real way to fix this issue properly.
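The idea of letting the client make N smaller calls can be sketched roughly as below. None of these names exist in Cleanarr: `fetch_page` stands in for a hypothetical paged endpoint, and `fetch_all` is the client-side loop that keeps each individual request short.

```python
from typing import Callable, List, Sequence


def fetch_all(fetch_page: Callable[[int, int], Sequence[dict]], limit: int = 20) -> List[dict]:
    """Collect every item by requesting one small page at a time.

    fetch_page(offset, limit) is a hypothetical callable that returns at most
    `limit` items starting at `offset`, and an empty sequence when exhausted.
    """
    results: List[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            break
        results.extend(page)
        # A UI could render `page` here instead of waiting for the full set.
        offset += len(page)
    return results


if __name__ == "__main__":
    fake_library = [{"id": i} for i in range(45)]

    def fake_fetch_page(offset: int, limit: int) -> Sequence[dict]:
        # Stand-in for an HTTP call; slices an in-memory list.
        return fake_library[offset:offset + limit]

    print(len(fetch_all(fake_fetch_page, limit=20)))
```

Each call only has to finish within the timeout for one page, so total library size no longer determines whether a single request succeeds.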

However, this has me curious as it seems an incredibly long period of time just to retrieve data. Taking a slightly deeper look I noticed the vast majority of time is being lost to movie_to_dict (varying levels of impact depending on the property being requested).

To demonstrate this I commented out the "shows" element of get_dupe_content (just to limit my exploration to movies) and ran some tests. The first a crude isolation and benchmark of the existing code:

root@fe1d2cbfc46e:/src# cat test_old.py 
import time
from plexwrapper import PlexWrapper

if __name__ == "__main__":
    print("simple test for old version")
    start = time.time()
    PlexWrapper().get_dupe_content()
    end = time.time()
    print(f"finished in {end-start}")
root@fe1d2cbfc46e:/src# python3 test_old.py
...
finished in 34.96932053565979

Now, removing ~half of the properties:

    def movie_to_dict_new(self, movie: Movie, library: str) -> dict:
        # https://python-plexapi.readthedocs.io/en/latest/modules/video.html#plexapi.video.Movie
        return {
            # **self.video_to_dict(movie),
            "contentType": 'movie',
            "library": library,
            "duration": movie.duration,
            "guid": movie.guid,
            # "originalTitle": movie.originalTitle,
            # "originallyAvailableAt": str(movie.originallyAvailableAt),
            # "rating": movie.rating,
            # "ratingImage": movie.ratingImage,
            "studio": movie.studio,
            # "tagline": movie.tagline,
            # "userRating": movie.userRating,
            "year": movie.year,
            "media": [self.media_to_dict(media) for media in movie.media],
        }

It runs ~6x faster

finished in 5.945791244506836
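To see which properties account for the difference, each attribute access can be timed on its own. The sketch below is a generic illustration rather than the benchmark used above: `time_attributes` and `FakeMovie` are hypothetical, and the sleep merely simulates a property that is expensive to read.

```python
import time
from typing import Dict, Iterable


def time_attributes(obj: object, names: Iterable[str]) -> Dict[str, float]:
    """Return the seconds spent reading each named attribute from obj."""
    timings: Dict[str, float] = {}
    for name in names:
        start = time.perf_counter()
        getattr(obj, name, None)  # missing attributes are read as None
        timings[name] = time.perf_counter() - start
    return timings


class FakeMovie:
    """Stand-in object: one cheap attribute, one artificially slow property."""

    year = 2020

    @property
    def tagline(self) -> str:
        time.sleep(0.05)  # simulates a slow lookup
        return "example"


if __name__ == "__main__":
    for name, seconds in time_attributes(FakeMovie(), ["year", "tagline"]).items():
        print(f"{name}: {seconds:.4f}s")
```

Run against a real movie object and summed over a library, a table like this would show which of the commented-out properties are worth deferring.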

I'm not sure if it's feasible that some of these properties could be cached ahead of time as part of a warmup, or avoided altogether, but thought I'd flag regardless.
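As one possible shape for the caching idea, serialized items could be memoized by a stable key so that repeated requests skip the expensive property reads. This is only a sketch: `DictCache`, the choice of key (for example a movie's `guid`), and the `serialize` callable are all assumptions for illustration, and cache invalidation is left out entirely.

```python
from typing import Callable, Dict


class DictCache:
    """Memoize serialized items by key so each is built at most once."""

    def __init__(self, serialize: Callable[[object], dict]) -> None:
        self._serialize = serialize
        self._store: Dict[str, dict] = {}

    def get(self, key: str, item: object) -> dict:
        """Return the cached dict for key, serializing item on first use."""
        if key not in self._store:
            self._store[key] = self._serialize(item)
        return self._store[key]

    def warm(self, items: Dict[str, object]) -> None:
        """Pre-populate the cache, e.g. from a background job at startup."""
        for key, item in items.items():
            self.get(key, item)
```

A warmup pass would pay the serialization cost once, outside the request path, at the price of serving data that may be stale until the cache is refreshed.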

For my use case at least, this problem is sadly due to a combination of how this call is architected and the size of my library.

se1exin commented 1 year ago

Wow, thanks @peter-mcconnell for taking the time to dig into this in such detail!

Given your results, there is evidently much that can be done to improve the loading of each media item. If I were to make a branch with some architecture changes would you be happy to test them out on your library?

peter-mcconnell commented 1 year ago

Sure thing, happy to guinea-pig. I could probably create a pytest that loads sample data from a text file to repro the issue also. If I get some free time this week I'll give that a shot

peter-mcconnell commented 1 year ago

I had some free time tonight so hacked something together for this which has got the UI loading correctly for me.

PR submitted: https://github.com/se1exin/Cleanarr/pull/84

austempest commented 1 year ago

@se1exin I pulled the image from @peter-mcconnell's fork, and can confirm PR #84 fixes this issue.