divijbindlish / parse-torrent-name

Extract media information from a filename
MIT License

Much improved support for years parsing #30

Open dchevell opened 4 years ago

dchevell commented 4 years ago

I've reviewed the following PRs related to this:

They all take the approach of only supporting 2020-2029 or 2020-2039. I'm sure that seems "far enough" ahead, but I'm also sure the original author thought the same thing when they maxed out the original pattern at 2019.

A related issue: this still doesn't handle films with years in the title (e.g. "2001: A Space Odyssey", "Death Race 2000" or "2012") (https://github.com/divijbindlish/parse-torrent-name/issues/20 and https://github.com/divijbindlish/parse-torrent-name/issues/28)

This PR supports years up to 2099, and makes the following changes to support movies with years in the title

Added tests for the above examples, as well as one for a 2020 film
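To illustrate the kind of rule being discussed, here is a minimal sketch in Python. It is not the code from this PR: the pattern `YEAR_RE` and the helper `find_release_year` are hypothetical names, and the heuristic (take the last year-like token, and never treat a year at the very start of the name as the release year) is an assumption chosen for illustration.

```python
import re

# Hypothetical pattern: any year from 1900 to 2099 that is not part of a
# longer run of digits (so "1080p" and "2160p" are not matched).
YEAR_RE = re.compile(r"(?<!\d)(19\d{2}|20\d{2})(?!\d)")


def find_release_year(name):
    """Return the likely release year as an int, or None.

    Assumed heuristic: the release year is the LAST year-like token in the
    name, and a year at the very start of the name belongs to the title.
    """
    matches = list(YEAR_RE.finditer(name))
    if not matches:
        return None
    last = matches[-1]
    if last.start() == 0:
        # The only year-like token opens the name, e.g. "2012.1080p.BluRay".
        return None
    return int(last.group(1))


print(find_release_year("2001.A.Space.Odyssey.1968.1080p.BluRay"))  # 1968
print(find_release_year("Death.Race.2000.1975.720p"))               # 1975
print(find_release_year("2012.1080p.BluRay"))                       # None
print(find_release_year("Some.Movie.2020.1080p.WEB-DL"))            # 2020
```

A name whose title ends in a year and carries no release year at all (for example "Death Race 2000.HDRip") is still ambiguous under this heuristic; the sketch would report 2000.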

platelminto commented 4 years ago

Unfortunately, this project seems dead: I've emailed the author and heard nothing, and there has been no activity on GitHub for the past few years, so pull requests are very unlikely to get merged. If you're looking for something that seems a bit more maintained, I'd use roidayan's fork, or my (now unforked) version.

As for the specifics here, keeping the valid dates relatively close (~decade?) to what they are today is helpful for titles such as "Blade Runner 2049". Many movie releases do already include the release year, and in those cases it wouldn't be an issue, but not all do (the 1st result for Blade Runner on 1377x.to doesn't include the release year, for example). Something to keep in mind!

dchevell commented 4 years ago

In that case, I might just publish my fork under a different name.

To your point about handling movies like Blade Runner 2049, my proposed changes here would handle that case just fine without artificially limiting possible year matches to the 2020s. The problem with the current approach is that if future years in titles could be mistaken for the release year, there are movie titles with current or past years that would run into the same problem (like the example in my PR, “Death Race 2000”). This change avoids that problem altogether.

platelminto commented 4 years ago

In this case, I am specifically talking about titles that end with a year, where the rest of the torrent name has no release year - there is always ambiguity in these cases, and yes with current or past years this would cause issues, but limiting the year can help to a certain extent.

The example I am specifically talking about is: 'Blade Runner 2049.HDRip.XviD.AC3-EVO'.

I had previously thought of programmatically finding out what year it is, then using that as a maximum, but it seems a bit overkill for such an already specific edge case.
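For reference, the current-year cap could be sketched roughly as follows. This is only an illustration under stated assumptions: `release_year_with_cap` and `YEAR_RE` are hypothetical names, not part of parse-torrent-name, and the one-year slack added to today's year is an arbitrary choice.

```python
import datetime
import re

# Hypothetical pattern: a standalone year from 1900 to 2099.
YEAR_RE = re.compile(r"(?<!\d)(19\d{2}|20\d{2})(?!\d)")


def release_year_with_cap(name, max_year=None):
    """Return the last plausible release year in `name`, or None.

    A year is treated as plausible when it is not at the very start of the
    name and does not exceed `max_year`. When `max_year` is omitted it
    defaults to the current year plus one (the slack is an assumption).
    """
    if max_year is None:
        max_year = datetime.date.today().year + 1
    years = [int(m.group(1)) for m in YEAR_RE.finditer(name) if m.start() > 0]
    plausible = [y for y in years if y <= max_year]
    return plausible[-1] if plausible else None


# With the cap fixed at 2021 for reproducibility:
print(release_year_with_cap("Blade Runner 2049.HDRip.XviD.AC3-EVO", max_year=2021))  # None
print(release_year_with_cap("Blade.Runner.2049.2017.1080p.BluRay", max_year=2021))   # 2017
```

The cap only helps with titles that end in a future year. A title ending in a past year with no release year in the name, such as "Death Race 2000.HDRip", would still be read as released in 2000.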