zzfet / stash-c4s-pyscraper

Python-based Clips4Sale scraper for Stash

Handling of missing/censored words #4

Open zzfet opened 1 year ago

zzfet commented 1 year ago

Improved handling of censored words would be very useful. At the moment, a lot of these censored words are rendered as double spaces. Some sort of [CENSORED] indicator would make it clear that a word used to be there. Need to find a way of identifying these gaps in the description text.
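
A rough sketch of one way this might work, assuming a run of two or more spaces between words always marks a removed word (this has not been verified against real descriptions, and a double space after a full stop would be a false positive). The name `mark_censored` and the default marker are placeholders, not anything in the scraper today:

```python
import re

# Assumption: a gap of 2+ spaces/tabs between non-space characters
# means a word was stripped out. Newlines are deliberately left alone.
_GAP = re.compile(r"(?<=\S)[ \t]{2,}(?=\S)")


def mark_censored(text: str, marker: str = "[CENSORED]") -> str:
    """Replace each suspicious gap with a visible marker."""
    # A function is used as the replacement so that any backslashes
    # in `marker` are inserted literally.
    return _GAP.sub(lambda m: f" {marker} ", text)


print(mark_censored("She uses  on him"))
```

Two censored words in a row would collapse into a single marker with this approach, since the gap is matched as one run.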

zzfet commented 1 year ago

Automatic find/replace of known censored word bypasses (for example, p0pp3rs to get around censorship of the word 'poppers' - I've seen this in the description of some videos) would be good. I wonder if there's an established method of converting 'l33t speak' into conventional text in Python...?
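
I'm not aware of a standard-library function for this, but a possible sketch uses `str.translate` with a hand-written character map and only accepts the result if it lands on a word from a known list. The map, the `KNOWN_WORDS` set and the name `deleet` are all made up for illustration; the whitelist is there so that ordinary tokens such as '1080p' are not mangled:

```python
import re

# Hypothetical substitution table; extend as new bypasses turn up.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "!": "i", "@": "a", "$": "s",
})

# Hypothetical list of words known to be written in l33t to dodge censorship.
KNOWN_WORDS = {"poppers"}

# A token is a run of letters, digits and l33t symbols that does not end
# in "!", so trailing punctuation is left outside the token.
_TOKEN = re.compile(r"[A-Za-z0-9!@$]*[A-Za-z0-9@$]")


def deleet(text: str, known_words=KNOWN_WORDS) -> str:
    """Rewrite a token only when its translation is a known word."""
    def fix(match):
        token = match.group(0)
        candidate = token.lower().translate(LEET_MAP)
        if candidate != token.lower() and candidate in known_words:
            return candidate  # note: original capitalisation is lost
        return token
    return _TOKEN.sub(fix, text)


print(deleet("Lots of p0pp3rs in 1080p!"))
```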

For starters, clips from the studio 'HumiliationPOV' render the words 'hypnotic' and 'hypno' as 'n0 t!c' and 'n0'. Some logic saying: "If studio is HumiliationPOV, do find/replace on these terms" would be handy.
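
Something along these lines might do it: a per-studio lookup table, applied only when the studio name matches. The table layout and the name `apply_studio_replacements` are hypothetical; the only entries are the two HumiliationPOV terms mentioned above. Longer terms are tried first so 'n0 t!c' is not half-replaced by the 'n0' rule:

```python
import re

# Hypothetical table: studio name -> {rendered text: intended word}.
STUDIO_REPLACEMENTS = {
    "HumiliationPOV": {
        "n0 t!c": "hypnotic",
        "n0": "hypno",
    },
}


def apply_studio_replacements(text: str, studio: str,
                              table=STUDIO_REPLACEMENTS) -> str:
    """Apply the find/replace rules registered for one studio, if any."""
    replacements = table.get(studio)
    if not replacements:
        return text
    # Longest term first, so "n0 t!c" wins over its prefix "n0".
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in terms) + r")(?!\w)"
    )
    return pattern.sub(lambda m: replacements[m.group(0)], text)


print(apply_studio_replacements("A n0 t!c n0 clip", "HumiliationPOV"))
```

Matching is case-sensitive and exact here, so a variant such as 'N0' or a different amount of spacing inside 'n0 t!c' would need its own entry.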