Open gargolito opened 1 year ago
This issue has been discussed many times for other shows and movies whose titles end with the word "us".
I'll leave the issue open, though, as I also want this to be fixed. Maybe we could consider US as a country only if it's uppercase, when it's not bounded by other matches.
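A minimal sketch of the uppercase part of that idea, for illustration only. It is not guessit code: `us_is_country_candidate` is a hypothetical helper, the second filename is just an illustrative example, and the "not bounded by other matches" condition is left out because it depends on guessit's internal match list.

```python
import re


def us_is_country_candidate(name: str) -> bool:
    """Hypothetical check: True only if 'US' appears as a standalone,
    fully uppercase token in the filename."""
    # Split on the separators commonly found in release names.
    tokens = re.split(r"[.\s_\-]+", name)
    # Case-sensitive comparison: "Us" and "us" are treated as title words.
    return "US" in tokens


print(us_is_country_candidate("The.Last.of.Us.S01E01"))  # False
print(us_is_country_candidate("The.Office.US.S01E01"))   # True
```

With a rule like this, "Us" at the end of a title would stay part of the title, while an explicit "US" tag could still be read as a country.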
I have another weird case. An episode of Ted Lasso had a filename containing the season, the episode, and a run of numbers: S03E03.4.5.1
guessit thought it was a multi-episode file. I can see why. I haven't looked at your code yet, but are you using any kind of NLP, like spaCy? That might help with calculating the distance between words.
guessit Ted.Lasso.S03E03.4-5-1.mkv
For: Ted.Lasso.S03E03.4-5-1.mkv
GuessIt found: {
"title": "Ted Lasso",
"season": 3,
"episode": [
3,
4,
5
],
"episode_title": "1",
"screen_size": "1080p",
"streaming_service": "AppleTV",
"source": "Web",
"audio_codec": "Dolby Digital Plus",
"audio_channels": "5.1",
"video_codec": "H.264",
"release_group": "NTb",
"container": "mkv",
"mimetype": "video/x-matroska",
"type": "episode"
}
I haven't looked at your code yet, but are you using any kind of NLP, like spaCy?
No, it's just a big bunch of regexes and rules to resolve conflicts between matches. I'm pretty sure some AI-based algorithm could perform nicely for parsing movie and series filenames, but guessit is not based on any of those.
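To make the "regexes plus rules" idea concrete, here is a toy sketch. It assumes nothing about guessit's real patterns or rules: `EPISODE_RE` and `toy_parse` are hypothetical names, and the "consecutive numbers" rule is only one plausible way a filename like the Ted Lasso one could end up looking like a multi-episode file.

```python
import re

# Toy pattern, NOT guessit's: SxxEyy followed by optional ".n" or "-n" groups.
EPISODE_RE = re.compile(
    r"S(?P<season>\d{1,2})E(?P<episode>\d{1,2})(?P<extra>(?:[.\-]\d{1,2})*)",
    re.IGNORECASE,
)


def toy_parse(name: str) -> dict:
    """Extract season and episode numbers with one regex and one rule."""
    match = EPISODE_RE.search(name)
    if match is None:
        return {}
    episodes = [int(match.group("episode"))]
    candidates = [int(n) for n in re.findall(r"\d+", match.group("extra"))]
    # Toy rule: a trailing number counts as another episode only while it
    # continues the sequence (3, 4, 5, ...); stop at the first break.
    for number in candidates:
        if number != episodes[-1] + 1:
            break
        episodes.append(number)
    return {"season": int(match.group("season")), "episode": episodes}


print(toy_parse("Ted.Lasso.S03E03.4-5-1.mkv"))
# {'season': 3, 'episode': [3, 4, 5]}
```

In this sketch the regex alone cannot tell that "4-5-1" is an episode title; only an extra rule or outside knowledge about the show could.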
This issue has been discussed many times for other shows and movies whose titles end with the word "us".
I'll leave the issue open, though, as I also want this to be fixed. Maybe we could consider US as a country only if it's uppercase, when it's not bounded by other matches.
Hi again @Toilal, are you going to implement this? I'm in favor of it.
guessit The.Last.of.Us.S01E01
For: The.Last.of.Us.S01E01