Closed rion18 closed 4 months ago
Expected behavior
I'm using obscenity to censor a string that contains an emoji, such as 🤣bummer, with a dataset that contains the word bummer.
With this strategy for removing profanities,
const CENSOR_STRATEGY = (censorContext) => ''.repeat(censorContext.matchLength);
the expected output is 🤣.
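As a side note on the strategy itself: repeating an empty string any number of times still gives an empty string, so this strategy removes each match entirely regardless of its length. A quick check in plain JavaScript, using a hand-made object as a stand-in for the censor context (only matchLength is read here; the real context object passed by the library has more fields):

```javascript
const CENSOR_STRATEGY = (censorContext) => ''.repeat(censorContext.matchLength);

// Stand-in context objects; only `matchLength` matters for this strategy.
const results = [0, 1, 6].map((matchLength) => CENSOR_STRATEGY({ matchLength }));

console.log(results); // [ '', '', '' ]
```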
Actual behavior
Instead, the output is 🤣b. The matcher does detect the word bummer, but the index it reports for the match is wrong.
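The output 🤣b is what you would get if the removed span started one UTF-16 code unit too late. The sketch below only illustrates that arithmetic in plain JavaScript. It does not use obscenity, removeSpan is a hypothetical helper, and the one-unit shift is an assumption made for illustration, since the thread does not state the root cause.

```javascript
const input = '🤣bummer';
const word = 'bummer';

// 🤣 is U+1F923, outside the Basic Multilingual Plane, so it takes two
// UTF-16 code units. String indices and lengths count code units.
const codeUnitLength = input.length;        // 8
const codePointLength = [...input].length;  // 7

// Hypothetical helper (not part of obscenity): delete `length` code units
// starting at code-unit index `start`.
function removeSpan(text, start, length) {
  return text.slice(0, start) + text.slice(start + length);
}

const correctStart = input.indexOf(word);   // 2: right after the emoji
const expected = removeSpan(input, correctStart, word.length);    // '🤣'

// Assumption for illustration only: the reported start is one unit too high.
const shifted = removeSpan(input, correctStart + 1, word.length); // '🤣b'

console.log({ codeUnitLength, codePointLength, expected, shifted });
```

Any index computed in one unit (code points) and applied in the other (code units) will drift by one for each astral character that precedes the match, which is why emoji are a useful test input for matchers.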
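Minimal reproducible example
const {
  englishDataset,
  parseRawPattern,
  DataSet,
  RegExpMatcher,
  TextCensor,
} = require('obscenity');

// English dataset plus a custom phrase for "bummer".
const profanityDataset = new DataSet()
  .addAll(englishDataset)
  .addPhrase((phrase) =>
    phrase
      .setMetadata({ originalWord: 'bummer' })
      .addPattern(parseRawPattern('bummer'))
  )
  .build();

const matcher = new RegExpMatcher({
  ...profanityDataset,
  // no transformers
});

// Assumed setup: the original snippet used textCensor without creating it.
const textCensor = new TextCensor().setStrategy(CENSOR_STRATEGY);

// Wrapped in a function because the original snippet used bare return statements.
function censor(input) {
  if (matcher.hasMatch(input)) {
    const matches = matcher.getAllMatches(input, true);
    return textCensor.applyTo(input, matches);
  }
  return input;
}

const stringBummer = '🤣bummer';
censor(stringBummer);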
Node.js version
18.17.1
Obscenity version
0.3.1
Thanks for the short repro. I think I know what the issue is and will take a stab at fixing it today.
Fix released in v0.4.0.