KBlixt / subcleaner

removes ads from subtitle files cleanly.

False Positive on Season X Episode X #36

araujojr82 closed this issue 1 year ago

araujojr82 commented 1 year ago

Hi, I'm trying to build a Portuguese regex file, and in my tests I stumbled upon this regex in global.conf: global_purge3: s(eason)?\W\d+[^,]\We(pisode)?\W*\d+[^,]

This regex removes the following subtitle block from my file:

    00:21:19,280 --> 00:21:21,760
    casos famosos,
    especialmente nos anos 70 e 80,

Which translates to "famous cases, especially in the 70s and 80s,"

It matches this part: "s 70 e 80" (the final s of "anos" is read as the "s" of "season", and the "e" as the "e" of "episode").
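
For anyone who wants to reproduce this outside subcleaner, here is a minimal sketch using Python's `re` module. The pattern is copied verbatim from above. How subcleaner actually compiles and applies it is not shown in this thread, so the `re.IGNORECASE` flag and the helper name `find_purge_match` are assumptions made for illustration only.

```python
import re

# Pattern copied verbatim from global_purge3 as quoted in this issue.
# Assumption: case-insensitive matching; subcleaner's real flags are not shown here.
PURGE3 = re.compile(r"s(eason)?\W\d+[^,]\We(pisode)?\W*\d+[^,]", re.IGNORECASE)


def find_purge_match(text):
    """Hypothetical helper: return the matched substring, or None if nothing matches."""
    match = PURGE3.search(text)
    return match.group(0) if match else None


# The Portuguese line from the subtitle block above.
false_positive = find_purge_match("especialmente nos anos 70 e 80,")
# A line the pattern is presumably meant to catch.
intended_hit = find_purge_match("Season 01 Episode 02")

print(repr(false_positive))
print(repr(intended_hit))
```

The pattern never checks what comes before the `s`, so any word ending in "s" followed by a number, an "e", and another number can trigger it.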

I think a good solution would be to check the character before the 's(eason)'. Is this possible somehow?

KBlixt commented 1 year ago

I added a word break in front of the s in the mentioned regex, so this should no longer be an issue, and it shouldn't break any existing matches.
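
As a rough illustration of the idea, the sketch below compares the pattern with and without a leading `\b`. Reading "word break" as the regex word-boundary anchor `\b` is an assumption, since the actual change to global.conf is not quoted in this thread. The `re.IGNORECASE` flag and the names `ORIGINAL`, `WITH_BOUNDARY`, and `is_flagged` are likewise hypothetical.

```python
import re

# Pattern as quoted in this issue.
ORIGINAL = r"s(eason)?\W\d+[^,]\We(pisode)?\W*\d+[^,]"
# Assumed form of the fix: a word boundary directly before the "s".
WITH_BOUNDARY = r"\b" + ORIGINAL


def is_flagged(pattern, text):
    """Hypothetical helper: True if the pattern matches anywhere in the text."""
    # Assumption: case-insensitive matching, as in a typical ad filter.
    return re.search(pattern, text, re.IGNORECASE) is not None


portuguese_line = "especialmente nos anos 70 e 80,"
episode_line = "Season 01 Episode 02"

for name, pattern in (("original", ORIGINAL), ("with \\b", WITH_BOUNDARY)):
    print(name, is_flagged(pattern, portuguese_line), is_flagged(pattern, episode_line))
```

With `\b` in place, the `s` has to start a word, so the trailing "s" of "anos" no longer qualifies, while a line that begins with "Season" is still caught.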

Give it a try and let me know 👍

Let me know how the Portuguese regex file turns out. I'll give it a quick look if you want to share it before adding it, but I'm afraid I can't help much since I'm really tight on time at the moment.

araujojr82 commented 1 year ago

Thank you!! It's working great!

I'll let you know once I'm done and maybe you can use it.