web-mech / badwords

A javascript filter for badwords
MIT License

Passing non-ASCII content to Filter.clean gives a slightly cryptic TypeError #167

Open Julian opened 1 year ago

Julian commented 1 year ago

Running:

> var Filter = require('bad-words'),
...     filter = new Filter();
undefined
> filter.clean("שלום עליכם");

produces:

Uncaught TypeError: Cannot read properties of null (reading '0')
    at Filter.clean (/Users/julian/Development/library-api-v2/node_modules/.pnpm/bad-words@3.0.4/node_modules/bad-words/lib/badwords.js:58:41)

The underlying cause is that the JS regex \b only recognizes boundaries next to ASCII word characters, so it never matches in input that contains none.
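The regex behavior can be reproduced without the library. This is a minimal sketch, assuming the failing line calls exec on a /\b/ pattern and indexes the result without a null check. That assumption is inferred from the error message, not confirmed against badwords.js:

```javascript
// For a flagless pattern, \b is defined relative to \w, which matches only [A-Za-z0-9_].
const boundary = /\b/;

// ASCII input: there is a boundary at the start of "hello", so exec() returns a match.
const asciiMatch = boundary.exec("hello world");

// Hebrew input: no character is in \w, so no position is a boundary and exec() returns null.
const hebrewMatch = boundary.exec("שלום עליכם");

console.log(asciiMatch !== null); // true
console.log(hebrewMatch);         // null

// Assumption: the library does something equivalent to indexing the exec() result directly.
let message = null;
try {
  hebrewMatch[0];
} catch (err) {
  message = err instanceof TypeError ? err.message : String(err);
}
console.log(message);
```

If that assumption holds, any input without at least one ASCII letter, digit, or underscore would hit the same TypeError, not just Hebrew text.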

I know this library seems to be essentially for English/ASCII text, but it may be worth either raising a more explicit error or returning non-ASCII input unchanged.
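In the meantime, the second option can be approximated on the caller's side. The sketch below uses a hypothetical helper, cleanOrPassThrough, which is not part of bad-words. It assumes the only failure mode is input with no ASCII word characters at all, and it uses a stand-in filter object so the example runs without the library installed:

```javascript
// Hypothetical helper, not part of bad-words: skip filtering when the input
// contains no ASCII word characters, since \b cannot match anywhere in it.
function cleanOrPassThrough(filter, text) {
  if (!/\w/.test(text)) {
    return text; // nothing for \b to split on; return the input unchanged
  }
  return filter.clean(text);
}

// Stand-in for a real Filter instance so the sketch is self-contained.
const stubFilter = { clean: (s) => s.toUpperCase() };

const passedThrough = cleanOrPassThrough(stubFilter, "שלום עליכם");
const cleaned = cleanOrPassThrough(stubFilter, "hello");

console.log(passedThrough); // input returned unchanged
console.log(cleaned);       // result of stubFilter.clean
```

With the real library, the filter created by new Filter() would be passed in place of stubFilter. Mixed input such as ASCII plus Hebrew still goes through filter.clean, so this guards only against the null case described above.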