winbatch closed this issue 1 year ago
--sensitive is a debugging option; it doesn't change the behavior other than to increase logging.
This seems to be an unfortunate subtitle. I'll do what I can, but some blocks are bound to be false positives and that is unfortunately unavoidable.
This is why I'm currently developing a review process where you get to review every deleted block that could plausibly be a valid subtitle.
At this point the script is really very good at finding ads without removing valid subtitles. I can put even more time into improving the included regex, but it'll never be perfect, and at some point there needs to be a small manual process that takes care of the edge cases.
For now you'll have to manually restore these files. I'll take a look at whether the included regex needs to be adjusted somewhat here. But if it's between this one false positive or 10 ad blocks not getting removed, I'll probably lean towards living with the false positive.
OK. Maybe we're on to something though. I misinterpreted --sensitive to mean something else. However, maybe you could have a flag that defines how 'aggressive' to be when cleaning, not so different from how gzip has a range of compression levels or how the number of -v's sets verbosity. The least aggressive level removes only known specific/exact-match strings. One step up does a bit more, like removing a block if it has http in its text or if the word 'subtitle' is in the text. The next step up from that does more again, and so on.
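To make the suggestion concrete, here is a minimal sketch of how tiered aggressiveness could work. It is not how the script behaves today: the --aggressiveness flag, the three tiers, the function names, and every pattern below are made up for illustration.

```python
import argparse
import re

# All tiers and patterns here are hypothetical examples, not the script's real config.
# Level 1: only remove blocks whose text exactly matches a known ad string.
EXACT_AD_STRINGS = {"subtitles by example-group"}
# Level 2: also remove blocks containing a URL or the word "subtitle(s)".
LIKELY_AD_PATTERNS = [
    re.compile(r"https?://", re.IGNORECASE),
    re.compile(r"\bsubtitles?\b", re.IGNORECASE),
]
# Level 3: also remove blocks that only match looser "warning" patterns.
WARNING_PATTERNS = [re.compile(r"\bcelebrity sex\b", re.IGNORECASE)]


def should_remove(text, level):
    """Return True if a subtitle block should be removed at the given level."""
    # Collapse whitespace and line breaks so exact matching ignores layout.
    normalized = " ".join(text.split()).lower()
    if level >= 1 and normalized in EXACT_AD_STRINGS:
        return True
    if level >= 2 and any(p.search(text) for p in LIKELY_AD_PATTERNS):
        return True
    if level >= 3 and any(p.search(text) for p in WARNING_PATTERNS):
        return True
    return False


def parse_level(argv):
    """Parse a hypothetical --aggressiveness flag (1 = least aggressive)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--aggressiveness", type=int, choices=[1, 2, 3], default=1)
    return parser.parse_args(argv).aggressiveness
```

With tiers like these, the Devs block quoted in this thread would survive at levels 1 and 2 and only be removed at level 3, so anyone who cares more about false positives than missed ads could stay at a lower level.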
I have no idea how this was removed. There is no regex that matches this.
| It's the celebrity sex tape
| to end all celebrity sex tapes.
"Celebrity sex" is a warning regex. I've seen this before; it's an Office episode where they discuss this at the end of the show, right?
I thought I had fixed it then, but I'll try again. Could you send me the original subtitle file?
Oh, I see, it's probably in one of the other regex files. I only ever touch the english.conf.
No worries, not for me at least. I don't mind it. The script works pretty damn well already. Thanks for creating it :)
The first block is valid (it's from the show Devs). I am running with --sensitive and it still got removed.