KBlixt / subcleaner

removes ads from subtitle files cleanly.

add indonesian config #19

Closed yatagasaru closed 1 year ago

yatagasaru commented 1 year ago

Default config for Indonesian subtitles.

yatagasaru commented 1 year ago

I'll accept the request if you revert the .gitignore file.

reverted

The global config already covers websites, so regex6 could be reduced to .(id|my). Otherwise your config might be oversensitive to legitimate websites that appear in the subtitle.

This is intended, because ads such as WWW.someshadywebsite.COM are a very common pattern in Indonesian (id) subtitles. If regex6 only includes (id|my), some ads will be missed.

KBlixt commented 1 year ago

Yes, but if you reduce it as I requested, it will still recognize www.shadysite.com, since the global config contains a regex for that pattern.

Both the global regex and the dedicated language regex are run on each file.

I will merge this, but I would still suggest that you take a look at it.

If a subtitle contains a legitimate site, because the characters are talking about a site in the movie, and you have multiple regexes for .com, then that line would get thrown out, since it would see www and .com twice.
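To illustrate the double-counting concern, here is a minimal Python sketch. It is not subcleaner's actual code: the pattern lists, the name `count_matches`, and the idea of simply counting matching patterns are assumptions made for illustration, and the patterns are not copied from the real global or Indonesian config.

```python
import re

# Hypothetical patterns for illustration only; not taken from subcleaner's configs.
GLOBAL_PATTERNS = [
    r"\bwww\.",             # generic "www." prefix
    r"\.(com|net|org)\b",   # generic top-level domains
]

# A language config that repeats what the global config already covers.
BROAD_LANGUAGE_PATTERNS = [
    r"\bwww\.",
    r"\.(com|id|my)\b",
]

# A language config reduced to the country-specific domains only.
REDUCED_LANGUAGE_PATTERNS = [
    r"\.(id|my)\b",
]


def count_matches(text, language_patterns):
    """Count how many patterns (global plus language) match the text."""
    patterns = GLOBAL_PATTERNS + language_patterns
    return sum(1 for p in patterns if re.search(p, text, flags=re.IGNORECASE))


ad = "WWW.someshadywebsite.COM"
legit = "Check www.example.com for details"

# The ad is still matched by the global patterns with the reduced config.
print(count_matches(ad, REDUCED_LANGUAGE_PATTERNS))    # 2
# With the broad config, www and .com are each counted twice.
print(count_matches(ad, BROAD_LANGUAGE_PATTERNS))      # 4
print(count_matches(legit, BROAD_LANGUAGE_PATTERNS))   # 4
print(count_matches(legit, REDUCED_LANGUAGE_PATTERNS)) # 2
```

In this sketch the reduced config still flags the ad through the global patterns, while a legitimate mention of a .com site collects half as many matches as it does under the broad config.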

yatagasaru commented 1 year ago

Thank you. I'm still testing this config, but so far, within my collection, it removes all ads with just one false detection, and that happens only because the text block sits between two ad blocks.