peter-mcconnell closed this 1 year ago
Found your docker image and gave it a whirl, works like a charm, thank you so much! Been waiting for a solution to large libraries for some time :)
I didn't even think to see if peter-mcconnell had a branch. I pulled his image and can confirm it at least loads one of my libraries. I'll try it on the rest later.
Is this available for unraid? If so, how?
Edit: I discovered how to have community apps search for all docker images.
glad to hear it's working for you folks. enjoy the spare GBs
Peter you are the man, working like a charm. Over 6k in my Movies library, so the timeouts were brutal.
Wow this is amazing. Thanks for the awesome contribution @peter-mcconnell !
Not sure if this works on Synology Docker? Tried and cannot mount the /config folder. Are these changes in the latest Selexin release or is it possible to add them there?
EDIT: I pulled your image from the Docker registry and that is the one that fails to load the /config mount path. I tried the same one used for Selexin's image and tried creating a new one. Both fail.
@roggstudelpest this PR does not change how the config folder works, and in my testing (albeit not on Synology) the new image works fine with mounting the config folder.
Can you please raise a new issue and include the Synology errors?
Strange that it now works for you guys... with the latest release, series are finally working for me, but an 8k movie library is still not working...
This worked great for my TV Shows library (230k episodes), but didn't work on Movies (45k). If I add my Movies library it gives the "HTTPConnectionPool(host='192.168.1.200', port=32400): Read timed out. (read timeout=30)" error every time after 30 seconds, no matter what I have PLEX_TIMEOUT set to. Have there been any solutions to this?
Yeah, there are undoubtedly more scenarios where timeouts are still likely. There are two high-level routes to solving this problem in my head:
1. Optimise the backend code further to get the calls down to some reasonable SLO for the average use case, ideally < 3s. The last time I looked at the code, most of this time was due to fetching more attributes about the movies / shows that it finds, so this route may also require a simplification of the UI.
2. Avoid the HTTP timeout issues entirely by building a little CLI that uses the same backend calls. This could be a "break glass" tool for people to run if their library is simply too heavy for the UI at present.
Option 1 is of course the neatest product solution, as there is a single, reliable path for all users. However, it is also the most involved from an engineering perspective and could bleed out into a more simplified UI view. Option 2 is the worse user experience, but it would be easy to build and would likely be an "ok" option for people whilst waiting on Option 1.
I could try to knock up Option 2 tonight / tomorrow evening if I get some free time, and will leave Option 1 to the maintainer(s).
Thank you for your response. That would be excellent. As I said in my previous comment, the TV version works great and does exactly what I need it to do. I set up two different Docker containers so that I could use the TV version while trying to hash out the Movie version's issues.
@bigsteve76 I've put something rough together for you to play with: https://github.com/se1exin/Cleanarr/pull/94
This is VERY simple and rough around the edges, but it should provide a starting block for a CLI-based approach to this product.
Would you mind adding it to Community Applications? I have no clue how to add dockers from Docker Hub in unraid :/ I have a library of 49,000 movies with around 2,000 dupes. Would love to get rid of them :P
What does this PR do?
tl;dr: moves the `while not end` loop to the UI and performs sections of this loop asynchronously. You can test this change with `docker pull pemcconnell/cleanarr`.

Updates `get_dupe_content` so that it accepts a `page` parameter instead of running inside a while loop - this moves the paging iteration outside of this function, allowing us to create external paging mechanisms. It then updates the `get_dupes` method so it takes an optional `page` argument, which gets passed to the `get_dupe_content` method - this moves the paging iteration to the URI level. Lastly, it updates the `loadDupeContent` method in the frontend code so that it is essentially responsible for the original `while not end` logic. This puts an HTTP request between each page of results. It shouldn't be too noticeable perf-wise for existing use cases, but I'm keen for validation there. I'm expecting a speed-up for all use cases due to the async batching this PR introduces, but I don't have a way to easily test for this. Importantly, this means that Cleanarr now works on larger datasets.
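In rough terms, the frontend side of this looks something like the sketch below (the route name, response shape, and end-of-data check here are illustrative rather than lifted from the diff):

```typescript
// Illustrative sketch only: the `while not end` loop, now driven from the UI.
// One page == one short HTTP request, so no single backend call has to walk
// the entire library before a proxy or browser timeout kicks in.
// The `/dupes?page=N` route and "empty page means done" rule are assumptions.
async function loadDupesSequentially(baseUrl: string): Promise<unknown[]> {
  const results: unknown[] = [];
  let page = 0;
  let end = false;
  while (!end) {
    const res = await fetch(`${baseUrl}/dupes?page=${page}`);
    const items = (await res.json()) as unknown[];
    if (items.length === 0) {
      end = true; // assumption: an empty page signals there is nothing left
    } else {
      results.push(...items);
      page += 1;
    }
  }
  return results;
}
```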
An optimisation was also added that executes dupes requests in batches of `PAGE_SIZE` asynchronously. This defaults to 10 but can be overridden with a new env var, `REACT_APP_PAGE_SIZE`. Cost: some needless calls to the dupes API. Reward: total time to load decreased for users in the UI. Due to this asynchronous logic, this PR makes larger libraries not only work, but work much faster (for me, load time went from ~90 seconds to ~15 after the async patch). 15 seconds is still way too long and I suspect 504s are still technically possible, but this is a reasonable start.
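Conceptually, the batching looks like the sketch below; only the `REACT_APP_PAGE_SIZE` name comes from this PR, while the route and end-of-data check are again assumptions:

```typescript
// Illustrative sketch only: fetch dupes pages in concurrent batches of
// PAGE_SIZE instead of one page at a time.
// REACT_APP_* vars are inlined at build time in a Create React App style build.
const PAGE_SIZE = Number(process.env.REACT_APP_PAGE_SIZE ?? "10");

async function fetchPage(baseUrl: string, page: number): Promise<unknown[]> {
  const res = await fetch(`${baseUrl}/dupes?page=${page}`);
  return (await res.json()) as unknown[];
}

async function loadDupesBatched(baseUrl: string): Promise<unknown[]> {
  const results: unknown[] = [];
  for (let batchStart = 0; ; batchStart += PAGE_SIZE) {
    // Fire PAGE_SIZE page requests at once and wait for the whole batch.
    const pages = Array.from({ length: PAGE_SIZE }, (_, i) => batchStart + i);
    const batch = await Promise.all(pages.map((p) => fetchPage(baseUrl, p)));
    for (const items of batch) {
      results.push(...items);
    }
    // An empty page means we've run past the end of the data; the pages
    // requested beyond that point are the "needless calls" mentioned above.
    if (batch.some((items) => items.length === 0)) {
      break;
    }
  }
  return results;
}
```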
What does this not do?
- I made no attempt to optimise any of the existing logic or external calls, aside from the asynchronous batching of ?page=X to ?page=Y. I suspect many opportunities for perf improvements remain in breaking the dupes API call up further, caching, and/or reducing the number of properties requested.
- I made no attempt at adding visual cues for the paging in the UI.
- I didn't write tests. Sorry. I have limited time, wanted to hack something together in 30 mins, and am not familiar with TypeScript, so this wasn't fluent for me.
- I haven't yet tested other parts of the software with my library - if I find other issues I'll try to patch and PR as I go.
Caveats
I have never written TypeScript before. No clue if what I wrote is clean / idiomatic, but it seems to work.
How was it tested / how should a reviewer test it?
Scanning over the open issues I believe this will solve the following:
Closes #55 Closes #57 Closes #60 Closes #62 Closes #66 Closes #77 Closes #70