jo3-l / obscenity

Robust, extensible profanity filter for NodeJS
MIT License

Obscenity

Robust, extensible profanity filter for NodeJS.

Why Obscenity?

Installation

$ npm install obscenity
$ yarn add obscenity
$ pnpm add obscenity

Example usage

First, import Obscenity:

const {
    RegExpMatcher,
    TextCensor,
    englishDataset,
    englishRecommendedTransformers,
} = require('obscenity');

Or, in TypeScript/ESM:

import {
    RegExpMatcher,
    TextCensor,
    englishDataset,
    englishRecommendedTransformers,
} from 'obscenity';

Now, we can create a new matcher using the English preset.

const matcher = new RegExpMatcher({
    ...englishDataset.build(),
    ...englishRecommendedTransformers,
});

With the matcher in place, we can search text for profanities. Here are two examples of what you can do:

Check if there are any matches in some text:

if (matcher.hasMatch('fuck you')) {
    console.log('The input text contains profanities.');
}
// The input text contains profanities.

Output the positions of all matches along with the original word used:

// Pass "true" as the "sorted" parameter so the matches are sorted by their position.
const matches = matcher.getAllMatches('ʃ𝐟ʃὗƈk ỹоứ 𝔟ⁱẗ𝙘ɦ', true);
for (const match of matches) {
    const { phraseMetadata, startIndex, endIndex } =
        englishDataset.getPayloadWithPhraseMetadata(match);
    console.log(
        `Match for word ${phraseMetadata.originalWord} found between ${startIndex} and ${endIndex}.`,
    );
}
// Match for word fuck found between 0 and 6.
// Match for word bitch found between 12 and 18.
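
Judging by the sample output above, the reported indices appear to be inclusive on both ends and counted in UTF-16 code units, the same units JavaScript string methods use. That reading is inferred from the output (six visible characters spanning 0 to 6, one of them outside the Basic Multilingual Plane), not from the API documentation. Under that assumption, a minimal sketch for recovering the matched text could look like this; `matchedText` is a hypothetical helper, not part of Obscenity, and the span is hand-written rather than produced by a matcher:

```javascript
// Hypothetical helper, not part of Obscenity: return the raw text a match covers.
function matchedText(input, { startIndex, endIndex }) {
    // endIndex is treated as inclusive, so slice() needs endIndex + 1.
    return input.slice(startIndex, endIndex + 1);
}

// Hand-written span standing in for real matcher output.
console.log(matchedText('fuck you little bitch', { startIndex: 16, endIndex: 20 }));
// bitch

// Characters outside the Basic Multilingual Plane occupy two code units each.
console.log('\u{1D41F}'.length);
// 2
```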

Censoring matched text:

To censor text, we'll need to import another class: the TextCensor. Some other imports and creation of the matcher have been elided for simplicity.

const { TextCensor /* , ...other imports elided */ } = require('obscenity');
// ...
const censor = new TextCensor();
const input = 'fuck you little bitch';
const matches = matcher.getAllMatches(input);
console.log(censor.applyTo(input, matches));
// %@$% you little **%@%

This is just a small slice of what Obscenity can do: for more, check out the documentation.

Accuracy

Note: As with all swear filters, Obscenity is not perfect (nor will it ever be). Use its output as a heuristic, and not as the sole judge of whether some content is appropriate or not.
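
One way to act on this advice is to let a match route content to a human reviewer instead of rejecting it outright. The sketch below assumes only the hasMatch method shown earlier; `triage`, its return values, and the stub matcher are hypothetical and exist purely for illustration.

```javascript
// Hypothetical triage helper: a match flags content for review rather than rejecting it.
function triage(text, matcher) {
    // Only hasMatch() is relied on here.
    return matcher.hasMatch(text) ? 'needs-review' : 'allowed';
}

// Stub standing in for a real matcher so the sketch runs without the package installed.
const stubMatcher = { hasMatch: (text) => /\bdarn\b/i.test(text) };

console.log(triage('well, darn', stubMatcher));
// needs-review
console.log(triage('hello there', stubMatcher));
// allowed
```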

With the English preset, Obscenity (correctly) finds matches in all of the following texts:

...and it does not match on the following:

Documentation

For a step-by-step guide on how to use Obscenity, check out the guide.

Otherwise, refer to the auto-generated API documentation.

Contributing

Issues can be reported using the issue tracker. If you'd like to submit a pull request, please read the contribution guide first.

Author

Obscenity © Joe L. under the MIT license. Authored and maintained by Joe L.

GitHub @jo3-l