clement-escolano / parse-torrent-title

Parse torrent title to extract useful information
MIT License
32 stars 22 forks source link

Support multi-episode torrents? #4

Open FezVrasta opened 6 years ago

FezVrasta commented 6 years ago

Thanks for the library!

I need to extract the episodes contained into a single torrent name, for example:

Marvel\'s.Agents.of.S.H.I.E.L.D.S02E01-03.Shadows.1080p.WEB-DL.DD5.1

This should, ideally, return as season an array like [1, 2, 3], but right now it returns just 1.

I use this RegExp in my own code right now but I think it would make sense to have this built into the library:

/[Eex]([0-9]{2})(?:\-)?([0-9]{2})?(?:[^0-9]|$)/g

// result: 1, 3 - which can then be used to infer the episodes in the middle of the sequence
clement-escolano commented 6 years ago

Hello! Indeed multi episode release names does not work currently with this library. Thank you for your RegExp, I will use it to integrate the feature soon. Right now it does not work because the structure of the regex handlers returns only the first match. But I will look into a way to make it more capable. Have a good day

jy95 commented 6 years ago

FYI the ugly regex used by sonarr (C# project) to handle this case :

//Multi-episode Repeated (S01E05 - S01E06, 1x05 - 1x06, etc)
                new Regex(@"^(?<title>.+?)(?:(?:[-_\W](?<![()\[!]))+S?(?<season>(?<!\d+)(?:\d{1,2}|\d{4})(?!\d+))(?:(?:[ex]|[-_. ]e){1,2}(?<episode>\d{1,3}(?!\d+)))+){2,}",
                          RegexOptions.IgnoreCase | RegexOptions.Compiled),

                //Episodes with a title, Single episodes (S01E05, 1x05, etc) & Multi-episode (S01E05E06, S01E05-06, S01E05 E06, etc) **
                new Regex(@"^(?<title>.+?)(?:(?:[-_\W](?<![()\[!]))+S?(?<season>(?<!\d+)(?:\d{1,2}|\d{4})(?!\d+))(?:[ex]|\W[ex]|_){1,2}(?<episode>\d{2,3}(?!\d+))(?:(?:\-|[ex]|\W[ex]|_){1,2}(?<episode>\d{2,3}(?!\d+)))*)\W?(?!\\)",
                          RegexOptions.IgnoreCase | RegexOptions.Compiled),

//Multi-episode release with no space between series title and season (S01E11E12)
                new Regex(@"(?:.*(?:^))(?<title>.*?)(?:\W?|_)S(?<season>(?<!\d+)\d{2}(?!\d+))(?:E(?<episode>(?<!\d+)\d{2}(?!\d+)))+",
                          RegexOptions.IgnoreCase | RegexOptions.Compiled),

                //Multi-episode with single episode numbers (S6.E1-E2, S6.E1E2, S6E1E2, etc)
                new Regex(@"^(?<title>.+?)[-_. ]S(?<season>(?<!\d+)(?:\d{1,2}|\d{4})(?!\d+))(?:[-_. ]?[ex]?(?<episode>(?<!\d+)\d{1,2}(?!\d+)))+",
                          RegexOptions.IgnoreCase | RegexOptions.Compiled),

//4 digit episode number
                //Episodes with a title, Single episodes (S01E05, 1x05, etc) & Multi-episode (S01E05E06, S01E05-06, S01E05 E06, etc)
                new Regex(@"^(?<title>.+?)(?:(?:[-_\W](?<![()\[!]))+S?(?<season>(?<!\d+)\d{1,2}(?!\d+))(?:(?:\-|[ex]|\W[ex]|_){1,2}(?<episode>\d{4}(?!\d+|i|p)))+)\W?(?!\\)",
                          RegexOptions.IgnoreCase | RegexOptions.Compiled),

We are lucky that naming groups will be in ES2018 so there will no big problem to reuse this code ^^