ghomasHudson / Jellyfin-Auto-Collections

Automatically make jellyfin collections from IMDB, Letterboxd lists and more.
MIT License

Movie can't be found/added if year date doesn't exactly match IMDB #16

Open mikewesten opened 1 year ago

mikewesten commented 1 year ago

Example of what I mean.

$ python letterboxd_list.py
Getting collections list...

found IndieWire’s Best Plot Twists of the 21st Century 4855b1fc8721f48609d69090910e6814
************************************************

Added Memento 686380644102773d00c2d3d668e6f4c9
Added Unbreakable d8a99d6b62ecaeb12c8cd5671a421386
Added Donnie Darko ffd3847923d89d56f1edf4b17d7ccf93
Added Mulholland Drive 4fc539b540c86eccbe20ce36cca2517c
Added The Others 0e33b2fcf0b60fa4f24682be0c57bc9c
Added The Ring 9306307a9b98e96069441f8215771740
Can't find High Tension
Added Oldboy d48ec04a578c664e5ddce81a46c6c26b
Added Saw 312465bc14c3cc7d938c1583fbd819e1
Added Eternal Sunshine of the Spotless Mind 25a726b84a110f9c3d9130b0faaee535
Added The Village f7dfeceb673a2a7cab77dec4dc516ee8
Can't find Caché
Added The Descent 18ab8cea2f15da85b5679d25b3f7528d
Added The Prestige 23f37ce00f95e033baedf0d60351ba78
Can't find Atonement
Added Gone Baby Gone 2aea0d106eb8b8b46a2fc36652da7613
Added The Mist 6e23efbc9dbe53274b7064e736593c36
Can't find Orphan
Added Shutter Island 945eebe6ff817f7abd66351ed570ab80
Can't find Certified Copy
Can't find Kill List
Can't find Goodnight Mommy
Added Gone Girl 8e774449c2ab2f7458788f5474df552f
Can't find Ex Machina
Can't find The Invitation
Added Arrival 2f83d79882f331aa06bcd7b40fcc115a
Can't find Get Out
Added Us b6e26ea20acedf49da405041af34b4bc
Added Parasite 1a55a3e414caf2ce85659493da04f67d
Added Malignant 8e541060316b96628865c7cbb334d23a
Can't find The Power of the Dog
Added Barbarian 215bd5f86ad5ad532b01046c7a1b0d5a

"Ex Machina" is a movie that should have been found but wasn't. TMDB and IMDB currently list different release years for it: 2015 and 2014, respectively. I changed the metadata of my "Ex Machina" entry in Jellyfin from 2015 to 2014 to match IMDB's release year (Letterboxd uses IMDB's release year... sometimes? All the time? Not quite sure.) When I then re-run the script, the movie is found:

$ python letterboxd_list.py
Getting collections list...

found IndieWire’s Best Plot Twists of the 21st Century 4855b1fc8721f48609d69090910e6814
************************************************

Added Memento 686380644102773d00c2d3d668e6f4c9
Added Unbreakable d8a99d6b62ecaeb12c8cd5671a421386
Added Donnie Darko ffd3847923d89d56f1edf4b17d7ccf93
Added Mulholland Drive 4fc539b540c86eccbe20ce36cca2517c
Added The Others 0e33b2fcf0b60fa4f24682be0c57bc9c
Added The Ring 9306307a9b98e96069441f8215771740
Can't find High Tension
Added Oldboy d48ec04a578c664e5ddce81a46c6c26b
Added Saw 312465bc14c3cc7d938c1583fbd819e1
Added Eternal Sunshine of the Spotless Mind 25a726b84a110f9c3d9130b0faaee535
Added The Village f7dfeceb673a2a7cab77dec4dc516ee8
Can't find Caché
Added The Descent 18ab8cea2f15da85b5679d25b3f7528d
Added The Prestige 23f37ce00f95e033baedf0d60351ba78
Can't find Atonement
Added Gone Baby Gone 2aea0d106eb8b8b46a2fc36652da7613
Added The Mist 6e23efbc9dbe53274b7064e736593c36
Can't find Orphan
Added Shutter Island 945eebe6ff817f7abd66351ed570ab80
Can't find Certified Copy
Can't find Kill List
Can't find Goodnight Mommy
Added Gone Girl 8e774449c2ab2f7458788f5474df552f
Added Ex Machina dcf626b7a7b6a7937b8f0d468d61922d
Can't find The Invitation
Added Arrival 2f83d79882f331aa06bcd7b40fcc115a
Can't find Get Out
Added Us b6e26ea20acedf49da405041af34b4bc
Added Parasite 1a55a3e414caf2ce85659493da04f67d
Added Malignant 8e541060316b96628865c7cbb334d23a
Can't find The Power of the Dog
Added Barbarian 215bd5f86ad5ad532b01046c7a1b0d5a

ghomasHudson commented 1 year ago

Hmm maybe I should add a year_threshold parameter to make it a little more forgiving.

The reason the year stuff is in there is to prevent mistagging of different versions of the same movie.
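A minimal sketch of what such a threshold check could look like. The function name `years_match` and its handling of missing years are assumptions for illustration, not the project's actual code; only the `year_threshold` name comes from the comment above.

```python
from typing import Optional


def years_match(list_year: Optional[int],
                library_year: Optional[int],
                year_threshold: int = 1) -> bool:
    """Return True when two release years differ by at most `year_threshold`."""
    if list_year is None or library_year is None:
        # Hypothetical policy: with no year to compare, do not reject the match.
        return True
    return abs(list_year - library_year) <= year_threshold


# Ex Machina: 2014 on IMDB vs. 2015 in the Jellyfin library.
print(years_match(2014, 2015))                    # True
# A strict filter (threshold 0) reproduces the "Can't find" behaviour.
print(years_match(2014, 2015, year_threshold=0))  # False
# A remake released a decade later is still treated as a different film.
print(years_match(2003, 2013))                    # False
```

A small threshold keeps the protection against mistagging remakes while tolerating the common one-year discrepancy between metadata providers.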

mikewesten commented 1 year ago

It is very common for the year to be off by 1, since IMDB respects premiere dates whereas TMDB doesn't. A bit more rare are movies where the release year is off by several years, but that does happen sometimes too.

mikewesten commented 5 months ago

#31 Was this enhancement implemented? I downloaded the latest master branch and re-ran the same list search from the OP, and Ex Machina is now found even with the year difference. So that's good.

Unfortunately, though, it falsely matched several movies, apparently through some fallback process that doesn't depend on IDs?

Example: Us (2019) is not on my server, so instead it matched and added Used Cars (1980), which is on my server. If I search movies in my Jellyfin app and type in "us", then Used Cars (1980) is the first search result in the list.

2nd Example: Barbarian (2022) is a movie I do have on my server, and it has its IMDB and TMDB IDs correctly filled in. Instead it matched and added Barbarian Queen (1985). If I search movies in my Jellyfin app and type in "barbarian", then Barbarian Queen (1985) is the first search result in the list. There are a lot of "barbarian" titled, themed, and tagged movies on my server, so the newer Barbarian (2022) is pretty far down the list of my search results.
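The two false matches above are what one would expect if the first search result is accepted without further checks. A hedged sketch of a stricter selection step follows: prefer an ID match, then fall back to an exact title plus a nearby year, and otherwise report nothing. The dictionary keys (`title`, `year`, `imdb_id`) and the function `pick_match` are hypothetical and are not the Jellyfin API's field names or the project's code.

```python
from typing import Optional


def normalize(title: str) -> str:
    """Case-insensitive title with collapsed whitespace."""
    return " ".join(title.casefold().split())


def pick_match(wanted: dict, search_results: list,
               year_threshold: int = 1) -> Optional[dict]:
    """Pick a library item for `wanted`, or None if nothing is safe to add."""
    # 1. Prefer a provider ID match when the list entry carries one.
    imdb_id = wanted.get("imdb_id")
    if imdb_id:
        for item in search_results:
            if item.get("imdb_id") == imdb_id:
                return item
    # 2. Fall back to an exact title match with a year close enough.
    for item in search_results:
        if normalize(item["title"]) != normalize(wanted["title"]):
            continue
        if wanted.get("year") is None or item.get("year") is None:
            continue
        if abs(item["year"] - wanted["year"]) <= year_threshold:
            return item
    # 3. Never accept "whatever the search returned first".
    return None


# Search results in the order described above (newest Barbarian far down).
barbarian_results = [
    {"title": "Barbarian Queen", "year": 1985},
    {"title": "Barbarian", "year": 2022},
]
us_results = [{"title": "Used Cars", "year": 1980}]

print(pick_match({"title": "Barbarian", "year": 2022}, barbarian_results))
print(pick_match({"title": "Us", "year": 2019}, us_results))  # None
```

With this kind of check, "Us" would be reported as not found instead of being replaced by Used Cars, and Barbarian (2022) would be chosen regardless of its position in the search results.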

mikewesten commented 5 months ago

$ python main.py
2024-06-13 07:54:40.957 | INFO | main::43 - Starting up
2024-06-13 07:54:40.957 | INFO | main::44 - Starting initial run
2024-06-13 07:54:41.129 | INFO | main:main:29 -
2024-06-13 07:54:41.130 | INFO | main:main:30 -
2024-06-13 07:54:41.130 | INFO | main:main:31 - Getting list info for plugin: letterboxd, list id: mitchell/list/indiewires-best-plot-twists-of-the-21st-century/
Getting collections list...
2024-06-13 07:54:41.910 | INFO | utils.jellyfin:find_collection_with_name_or_create:66 - No matching collection found for: IndieWire’s Best Plot Twists of the 21st Century. Creating new collection...
2024-06-13 07:54:42.260 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Memento to collection
2024-06-13 07:54:42.621 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Unbreakable to collection
2024-06-13 07:54:42.953 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Donnie Darko to collection
2024-06-13 07:54:43.346 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Mulholland Drive to collection
2024-06-13 07:54:43.712 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Others to collection
2024-06-13 07:54:44.460 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Ring to collection
2024-06-13 07:54:44.800 | WARNING | utils.jellyfin:add_item_to_collection:91 - Item High Tension not found in jellyfin
2024-06-13 07:54:45.159 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Oldboy to collection
2024-06-13 07:54:45.513 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Saw to collection
2024-06-13 07:54:45.906 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Eternal Sunshine of the Spotless Mind to collection
2024-06-13 07:54:46.262 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Village to collection
2024-06-13 07:54:46.709 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Descent to collection
2024-06-13 07:54:47.065 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Caché to collection
2024-06-13 07:54:48.027 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Prestige to collection
2024-06-13 07:54:48.425 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Atonement to collection
2024-06-13 07:54:48.775 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Gone Baby Gone to collection
2024-06-13 07:54:49.135 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Mist to collection
2024-06-13 07:54:49.503 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Orphan to collection
2024-06-13 07:54:50.267 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Shutter Island to collection
2024-06-13 07:54:50.603 | WARNING | utils.jellyfin:add_item_to_collection:91 - Item Certified Copy not found in jellyfin
2024-06-13 07:54:50.995 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Kill List to collection
2024-06-13 07:54:51.337 | WARNING | utils.jellyfin:add_item_to_collection:91 - Item Goodnight Mommy not found in jellyfin
2024-06-13 07:54:51.647 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Gone Girl to collection
2024-06-13 07:54:52.012 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Ex Machina to collection
2024-06-13 07:54:52.383 | INFO | utils.jellyfin:add_item_to_collection:96 - Added The Invitation to collection
2024-06-13 07:54:52.793 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Arrival to collection
2024-06-13 07:54:53.161 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Get Out to collection
2024-06-13 07:54:53.984 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Us to collection
2024-06-13 07:54:54.355 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Parasite to collection
2024-06-13 07:54:55.217 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Malignant to collection
2024-06-13 07:54:55.545 | WARNING | utils.jellyfin:add_item_to_collection:91 - Item The Power of the Dog not found in jellyfin
2024-06-13 07:54:55.916 | INFO | utils.jellyfin:add_item_to_collection:96 - Added Barbarian to collection

collection-list-screenshot

ghomasHudson commented 5 months ago

The year filter should be enabled now. It is on by default unless you specify year_filter: false in a plugin's config (you might want to add clear_collection once to each plugin to reset the collections).

Still need to implement the year threshold when I have time.

mikewesten commented 5 months ago

Re-tried the same as above with the master branch from today. The results are certainly different, improved somewhat but still imperfect.

$ python main.py
2024-06-13 12:58:10.371 | INFO | main::43 - Starting up
2024-06-13 12:58:10.372 | INFO | main::44 - Starting initial run
2024-06-13 12:58:10.628 | INFO | main:main:29 -
2024-06-13 12:58:10.628 | INFO | main:main:30 -
2024-06-13 12:58:10.629 | INFO | main:main:31 - Getting list info for plugin: letterboxd, list id: mitchell/list/indiewires-best-plot-twists-of-the-21st-century/
Getting collections list...
2024-06-13 12:58:11.320 | INFO | utils.jellyfin:find_collection_with_name_or_create:66 - No matching collection found for: IndieWire’s Best Plot Twists of the 21st Century. Creating new collection...
2024-06-13 12:58:11.754 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Memento not found in jellyfin
2024-06-13 12:58:12.078 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Unbreakable to collection
2024-06-13 12:58:12.441 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Donnie Darko to collection
2024-06-13 12:58:12.806 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Mulholland Drive to collection
2024-06-13 12:58:13.830 | INFO | utils.jellyfin:add_item_to_collection:115 - Added The Others to collection
2024-06-13 12:58:14.160 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item The Ring not found in jellyfin
2024-06-13 12:58:14.451 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item High Tension not found in jellyfin
2024-06-13 12:58:14.723 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Oldboy to collection
2024-06-13 12:58:15.075 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Saw not found in jellyfin
2024-06-13 12:58:15.441 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Eternal Sunshine of the Spotless Mind to collection
2024-06-13 12:58:15.801 | INFO | utils.jellyfin:add_item_to_collection:115 - Added The Village to collection
2024-06-13 12:58:16.149 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item The Descent not found in jellyfin
2024-06-13 12:58:16.494 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Caché to collection
2024-06-13 12:58:16.873 | INFO | utils.jellyfin:add_item_to_collection:115 - Added The Prestige to collection
2024-06-13 12:58:17.250 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Atonement to collection
2024-06-13 12:58:18.141 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Gone Baby Gone to collection
2024-06-13 12:58:18.495 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item The Mist not found in jellyfin
2024-06-13 12:58:18.837 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Orphan not found in jellyfin
2024-06-13 12:58:19.198 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Shutter Island to collection
2024-06-13 12:58:19.541 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Certified Copy not found in jellyfin
2024-06-13 12:58:19.819 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Kill List to collection
2024-06-13 12:58:20.159 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Goodnight Mommy not found in jellyfin
2024-06-13 12:58:20.441 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Gone Girl to collection
2024-06-13 12:58:20.807 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Ex Machina to collection
2024-06-13 12:58:21.221 | INFO | utils.jellyfin:add_item_to_collection:115 - Added The Invitation to collection
2024-06-13 12:58:21.584 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Arrival not found in jellyfin
2024-06-13 12:58:22.100 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Get Out to collection
2024-06-13 12:58:22.941 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Us not found in jellyfin
2024-06-13 12:58:23.322 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Parasite not found in jellyfin
2024-06-13 12:58:23.591 | INFO | utils.jellyfin:add_item_to_collection:115 - Added Malignant to collection
2024-06-13 12:58:23.984 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item The Power of the Dog not found in jellyfin
2024-06-13 12:58:24.260 | WARNING | utils.jellyfin:add_item_to_collection:110 - Item Barbarian not found in jellyfin

collection-list-screenshot-2

ghomasHudson commented 5 months ago

> Re-tried the same as above with the master branch from today. The results are certainly different, improved somewhat but still imperfect.

Yeah, it's very much a partial solution so far. It works by default for the IMDb lists, etc.

For Letterboxd, things are a little trickier, as the list pages don't contain the IMDb IDs. They have to be requested for each film individually, which slows things down a LOT.
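To illustrate the cost described above, here is a rough sketch of a per-film lookup with an in-memory cache, so that a film appearing in several lists is only requested once. It assumes the film page HTML contains a link of the form `imdb.com/title/tt…`; that markup assumption, the class `ImdbIdLookup`, and the injected `fetch_page` callable are all hypothetical and not taken from the project's code.

```python
import re
from typing import Callable, Dict, Optional

# Assumption: the film page links to IMDb with a URL containing /title/tt<digits>.
IMDB_ID_RE = re.compile(r"imdb\.com/title/(tt\d+)")


def extract_imdb_id(film_page_html: str) -> Optional[str]:
    """Return the first IMDb ID found in the page HTML, or None."""
    match = IMDB_ID_RE.search(film_page_html)
    return match.group(1) if match else None


class ImdbIdLookup:
    """Fetch each film page at most once and remember the extracted ID."""

    def __init__(self, fetch_page: Callable[[str], str]) -> None:
        self._fetch_page = fetch_page  # e.g. a wrapper around an HTTP GET
        self._cache: Dict[str, Optional[str]] = {}
        self.requests_made = 0

    def get(self, film_path: str) -> Optional[str]:
        if film_path not in self._cache:
            self.requests_made += 1  # one extra request per previously unseen film
            html = self._fetch_page(film_path)
            self._cache[film_path] = extract_imdb_id(html)
        return self._cache[film_path]


# Stand-in for a real HTTP request; the ID here is a placeholder.
def fake_fetch(film_path: str) -> str:
    return '<a href="http://www.imdb.com/title/tt0000000/maindetails">IMDb</a>'


lookup = ImdbIdLookup(fake_fetch)
first = lookup.get("/film/example/")
second = lookup.get("/film/example/")
print(first, lookup.requests_made)
```

The cache only helps with repeated films; a list of N distinct films still needs N page requests, which is the slowdown mentioned above.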

I've added an imdb_id_filter option that you can enable if you want this turned on:

...
plugins:
  letterboxd:
    enabled: true
    imdb_id_filter: true
    list_ids:
      - ....

Ultimately, I'll add a plugin for the Letterboxd API, which should solve this.

mikewesten commented 5 months ago

> ...
> plugins:
>   letterboxd:
>     enabled: true
>     imdb_id_filter: true
>     list_ids:

I added that line to my config file with the new branch, and unfortunately this fails for me. I thought maybe letterboxd.py also needed the toggle switched on (it's False by default), but even with that it fails.

$ python main.py
2024-06-14 19:40:42.709 | INFO | main::43 - Starting up
2024-06-14 19:40:42.710 | INFO | main::44 - Starting initial run
2024-06-14 19:40:42.898 | INFO | main:main:29 -
2024-06-14 19:40:42.898 | INFO | main:main:30 -
2024-06-14 19:40:42.898 | INFO | main:main:31 - Getting list info for plugin: letterboxd, list id: mitchell/list/indiewires-best-plot-twists-of-the-21st-century/
Traceback (most recent call last):
  File "main.py", line 45, in <module>
    main(config)
  File "main.py", line 32, in main
    list_info = plugins[plugin_name].get_list(list_id, config['plugins'][plugin_name])
  File "/home/mike/Applications/Jellyfin-Auto-Collections-master/plugins/letterboxd.py", line 29, in get_list
    r = requests.get(f"https://letterboxd.com{movie.find('a')['href']}", headers={'User-Agent': 'Mozilla/5.0'})
AttributeError: 'dict' object has no attribute 'find'

However, it sounds like the Letterboxd API will be a better solution anyway.

ghomasHudson commented 5 months ago

> ...
> plugins:
>   letterboxd:
>     enabled: true
>     imdb_id_filter: true
>     list_ids:
>       - ....
>
> I added that line to my config file with the new branch and unfortunately this fails for me. I thought maybe letterboxd.py also needed the toggle switched on as well (it's False by default), but even with that it also fails.

Just pushed some fixes which should solve this bug.

mikewesten commented 5 months ago

It solved that bug but now I'm just getting the same results as before minus that one Christmas movie. No matter, IMDB has some okay lists in the meantime.