exebetche / vlsub

VLC extension to download subtitles from opensubtitles.org

Improved the logic for detecting the season number from the file name, considering seasons with two digits #124

Closed ClintEsteMadera closed 8 years ago

ClintEsteMadera commented 8 years ago

I've noticed that the season/episode detection algorithm fails to parse two-digit season numbers. For Family Guy Season 14, for example, it wrongly recognizes the season as 4 instead of 14 (see screenshot below).

[Screenshot: fg]

I'm a developer, and even though I have no experience with Lua, I believe the issue is at line 1478 (looking at master), more precisely with this regex:

"(.+)(%d)[xX](%d%d).*"

The problem should be resolved by just adding a "+" after the first %d, to allow for two-digit seasons. However, I tried this locally and it didn't work (even though "+" is a valid quantifier according to Lua's documentation; maybe VLC uses an old version?).
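A likely explanation, sketched below, is that "+" is accepted but the leading "(.+)" is greedy: it consumes the first season digit before "(%d+)" gets a chance to match it. The file name and the `parse` helper are hypothetical stand-ins for illustration; in the extension the input comes from openSub.file.cleanName.

```lua
-- Hypothetical helper: apply a Lua pattern and return its three captures.
local function parse(name, pattern)
  return string.match(name, pattern)
end

-- Hypothetical sample name standing in for openSub.file.cleanName.
local name = "Family Guy 14x01"

-- Original pattern: exactly one season digit.
local show1, season1, episode1 = parse(name, "(.+)(%d)[xX](%d%d).*")

-- Same pattern with "+" added: the greedy (.+) still keeps the "1",
-- so (%d+) only receives the "4".
local show2, season2, episode2 = parse(name, "(.+)(%d+)[xX](%d%d).*")

print(show1, season1, episode1)
print(show2, season2, episode2)
```

Both calls return the season as "4", with the stray "1" left at the end of the show name, which matches the behaviour described above.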

So, the code that actually works (it's definitely not ideal, but it does the trick) is this:

    if not showName then
      -- Try a two-digit season number first.
      showName, seasonNumber, episodeNumber = string.match(
        openSub.file.cleanName,
        "(.+)(%d%d)[xX](%d%d).*")
      -- If no two-digit season was found, fall back to a single digit.
      -- TODO: Make the regex "(.+)(%d+)[xX](%d%d).*" work.
      if not seasonNumber then
        showName, seasonNumber, episodeNumber = string.match(
          openSub.file.cleanName,
          "(.+)(%d)[xX](%d%d).*")
      end
    end

This deals well with things like this:

[Screenshot: fg-fixed]

I humbly suggest considering a fix for this (admittedly small) issue, either by merging this solution or, even better, by getting Lua to accept the "+" symbol in the original regexp.

I am a big fan of this incredible extension and I'm glad to put in my two cents here.

Kind Regards.

exebetche commented 8 years ago

Hi, thank you for reporting this. You're right: the regex didn't accept two-digit season numbers. I didn't want to add a new regex just for that, so I chose to use this one instead: "(.-)(%d?%d)[xX](%d%d).*". It makes the first capture non-greedy by using '(.-)' instead of '(.+)', and '%d?%d' accepts either one or two season digits. Thank you again for your support.
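A minimal sketch of how the non-greedy version behaves, assuming the pattern keeps the `[xX](%d%d)` tail of the original regex. The `parse_episode` helper and the sample names are hypothetical and only stand in for the extension's own code and for openSub.file.cleanName.

```lua
-- Assumed pattern: lazy show name, one or two season digits, two episode digits.
local pattern = "(.-)(%d?%d)[xX](%d%d).*"

-- Hypothetical helper returning show name, season and episode as strings.
local function parse_episode(name)
  return string.match(name, pattern)
end

-- Hypothetical sample names: one two-digit season, one single-digit season.
local samples = { "Family Guy 14x01", "Family Guy 4x01" }

for _, name in ipairs(samples) do
  local show, season, episode = parse_episode(name)
  print(show, season, episode)
end
```

Because "(.-)" stops as early as it can, both season digits are left for "(%d?%d)", so the first sample yields season "14" and the second yields season "4". Note that the captured show name keeps its trailing space in this sketch.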