terrelsa13 / MUMC

Multi-User Media Cleaner aka MUMC (pronounced Mew-Mick) will go through movies, tv episodes, audio tracks, and audiobooks in your Emby/Jellyfin libraries deleting media items you no longer want.
GNU General Public License v3.0

Feature Request: Narrow Down Filter Statements (by blacktag perhaps?) #82

Closed twicehopped closed 11 months ago

twicehopped commented 1 year ago

Thank you very much for this project. I would love to make my usage of it more efficient by narrowing down the files MUMC searches over each time it runs.

My use case is this: I keep all "normal" TV show episodes, but would like to only keep the last 10 daily TV show episodes.

I currently accomplish this by having all libraries set to whitelist, and by blacktagging those shows that should only keep 10 episodes (in conjunction with the minimum_number_episodes setting).

However, MUMC searches over my entire library each time it runs, flagging >99% of my episodes as "KEEPING". I would love to be able to narrow down what it searches over to enhance performance.

terrelsa13 commented 1 year ago

There are some settings that force the script to query all media (both played and unplayed).

Please attach your mumc_config.py. It may be the settings.

twicehopped commented 1 year ago

That's the thing - I want it to keep the latest 5 episodes of a show whether anyone has watched them or not.

terrelsa13 commented 1 year ago

Please attach your mumc_config.py so I can see if there is anything in the config that can be done to help either make things more efficient or to speed things up.

terrelsa13 commented 1 year ago

@twicehopped Ok, after reading this a few more times I think I understand a little better what you are asking for. You want the script to only query for blacktagged media.

This might be possible in v5 of the script. I have to determine how much extra work that will be. If the script allows filtering by blacktag only, I also have to do the same for whitetags, favorites, blacklists, and whitelists. I am on the fence about how much value this brings to a script meant to be run during an off-peak time.

May I ask what on your side is making the filtering efficiency a noticeable issue?

On my setup, console output, post-processing, and debug are the bottlenecks.

I know this is not what you are asking for:

Try setting print_episode_keep_info=False, or try setting both print_episode_delete_info=False and print_episode_keep_info=False.
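As a minimal sketch, the relevant lines in mumc_config.py might look like the fragment below. The two setting names come from the suggestion above; the rest of the config file is assumed to stay unchanged.

```python
# Fragment of mumc_config.py showing only the two settings mentioned above.

# Suppress the per-episode console lines for kept items. When nearly
# everything is kept, these lines make up most of the output.
print_episode_keep_info = False

# Optionally suppress the per-episode lines for deleted items as well.
print_episode_delete_info = False
```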

twicehopped commented 1 year ago

I like the idea of print_episode_keep_info=False, since the vast majority of my media will be kept. I will incorporate this into my script.

There are currently over 10,000 files the script is looping over, only to keep 99.9%+ of them. I'd like it to loop only over the sub-directories or blacktagged items of my choosing and skip the rest of my permanent library. That's the biggest bottleneck to me. The script currently will not finish at all because of an unrelated bug; I will post that in a separate GitHub issue and upload my config file there.

Another option would be to allow for a condition days inequality in the original filter statements - what do you think about this? This would allow me to only search over media that was created in the past 30 days, for example, to determine the episodes to delete. This isn't perfect as it probably wouldn't delete the ends of daily shows that have breaks longer than 30 days between seasons.
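A rough illustration of that idea and its limitation. This is not MUMC code: the helper `created_within` and the dictionary keys are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta


def created_within(items, days, now):
    """Return only the items created within the last `days` days (hypothetical helper)."""
    cutoff = now - timedelta(days=days)
    return [item for item in items if item["created"] >= cutoff]


now = datetime(2024, 1, 31)
episodes = [
    {"name": "recent episode", "created": datetime(2024, 1, 20)},
    # Created before a season break longer than 30 days.
    {"name": "pre-break episode", "created": datetime(2023, 11, 1)},
]

# Only items inside the 30-day window are even considered for deletion.
candidates = created_within(episodes, days=30, now=now)
print([e["name"] for e in candidates])
```

The pre-break episode never enters the candidate list, so it would never be examined or deleted, which is the gap described above.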

terrelsa13 commented 1 year ago

The issue preventing the script from finishing should be solved. Let me know if you run into any other issues.

Assuming the script is now able to finish, let's revisit these two comments:

> My use case is this: I keep all "normal" TV show episodes, but would like to only keep the last 10 daily TV show episodes.

> That's the thing - I want it to keep the latest 5 episodes of a show whether anyone has watched them or not.

I want to make sure I understand what you are trying to do...

  1. You want to keep the last 10 Daily TV Episodes. Do you want to keep the last 10 unwatched, the last 10 watched, or the last 10 episodes regardless of watched/unwatched state?
  2. The bold parts of your comments above have confused me. At first you say you want to "keep all normal TV show episodes". But then you say you want to "keep the last 5 episodes whether anyone has watched them or not." Will you help me understand what you are looking to do?

twicehopped commented 1 year ago

  1. I only care about last 10 episodes (of certain shows) regardless of watched/unwatched. I imagine others would have a use case for only deleting watched shows though - but not my use case.

  2. Sorry - I'm separating "normal" TV shows that are likely to get rewatched over time vs. "daily" TV shows that no one is likely to ever want to rewatch - like game shows (e.g., Jeopardy) or talk shows (e.g., Late Night). My library contains both types of TV shows. Other apps, such as Sonarr, make a similar distinction.
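To make the two answers above concrete, here is a small sketch of the behavior being asked for. It is not how MUMC works internally; the function `split_keep_delete`, the dictionary keys, and the sample data are all hypothetical.

```python
from collections import defaultdict


def split_keep_delete(episodes, daily_series, keep_last=10):
    """Split episodes into (keep, delete) lists.

    Episodes of series not listed in `daily_series` are always kept.
    For daily series, only the newest `keep_last` episodes are kept,
    regardless of whether they have been played.
    """
    by_series = defaultdict(list)
    keep, delete = [], []
    for ep in episodes:
        if ep["series"] in daily_series:
            by_series[ep["series"]].append(ep)
        else:
            keep.append(ep)
    for eps in by_series.values():
        # ISO dates sort correctly as strings; newest first.
        eps.sort(key=lambda ep: ep["aired"], reverse=True)
        keep.extend(eps[:keep_last])
        delete.extend(eps[keep_last:])
    return keep, delete


episodes = [
    {"series": "Jeopardy", "aired": "2024-01-01", "played": True},
    {"series": "Jeopardy", "aired": "2024-01-02", "played": False},
    {"series": "Jeopardy", "aired": "2024-01-03", "played": False},
    {"series": "Normal Show", "aired": "2010-05-05", "played": True},
]
keep, delete = split_keep_delete(episodes, {"Jeopardy"}, keep_last=2)
print([ep["aired"] for ep in delete])
```

With `keep_last=2`, only the oldest Jeopardy episode lands in `delete`; the "normal" show is untouched no matter how old it is.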

terrelsa13 commented 11 months ago

@twicehopped My apologies. Life and work have been busy. Anytime I can find to work on this script I have been putting into updates and improvements to v5.

  1. In general, for this to work: set the played count to 0, then set the played inequality to >=. Lastly, set minimum_episodes to 10. You will then need to determine which setting to use for minimum_episodes_behavior.

  2. Emby and Jellyfin classify the children of both "normal" and "daily" TV series the same way, as seasons and episodes. The script will not look two levels up to determine if an individual episode belongs to a normal or daily TV series. The best way to handle this is to put the normal and daily TV shows in two different libraries OR in separate folders within the same library. Then only point the script at the library with the daily TV shows in it.

Keep in mind, if the separate-folders-in-the-same-library approach is used, both folders will have the same libraryId. This means the matching behavior setting needs to be changed from byId to byPath or byNetworkPath.
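A simplified sketch of why matching by ID cannot tell the two folders apart while matching by path can. This only illustrates the idea and is not the script's actual matching code; the function names, dictionary keys, IDs, and paths are made up for the example.

```python
from pathlib import PurePosixPath


def matches_by_id(item, target):
    """Match on library ID only (hypothetical helper)."""
    return item["library_id"] == target["library_id"]


def matches_by_path(item, target):
    """Match only if the item's file sits somewhere under the target folder."""
    return PurePosixPath(target["path"]) in PurePosixPath(item["path"]).parents


# Two folders in one library share the same library ID.
daily_folder = {"library_id": "abc123", "path": "/media/tv/daily"}
normal_ep = {"library_id": "abc123", "path": "/media/tv/normal/Show/S01E01.mkv"}
daily_ep = {"library_id": "abc123", "path": "/media/tv/daily/Jeopardy/S40E01.mkv"}

print(matches_by_id(normal_ep, daily_folder))    # same ID, so it matches
print(matches_by_path(normal_ep, daily_folder))  # different folder
print(matches_by_path(daily_ep, daily_folder))   # inside the daily folder
```

Matching by ID would sweep the "normal" episodes in along with the daily ones, which is why a path-based setting is needed for the shared-library layout.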