jo3-l / obscenity

Robust, extensible profanity filter for NodeJS
MIT License

bug: Memory leak when using an empty string #30

Closed jwoertink closed 1 year ago

jwoertink commented 1 year ago

Expected behavior

Proper error message

Actual behavior

JavaScript heap out of memory

❯ node index.js 

<--- Last few GCs --->

[79252:0x4d7f380]    21093 ms: Mark-sweep (reduce) 4080.9 (4142.9) -> 4080.7 (4141.9) MB, 1573.5 / 0.0 ms  (+ 1.8 ms in 2 steps since start of marking, biggest step 1.8 ms, walltime since start of marking 1584 ms) (average mu = 0.146, current mu = 0.098)
[79252:0x4d7f380]    21095 ms: Scavenge 4082.3 (4142.4) -> 4081.3 (4143.4) MB, 1.2 / 0.0 ms  (average mu = 0.146, current mu = 0.098) allocation failure

<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xafedf0 node::Abort() [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 2: 0xa1814d node::FatalError(char const*, char const*) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 3: 0xce795e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 4: 0xce7cd7 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 5: 0xeb16b5  [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 6: 0xeb21a4  [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 7: 0xec0617 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 8: 0xec39cc v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
 9: 0xe862ec v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
10: 0x11f3156 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
11: 0x15c9ed9  [/home/jeremy/.asdf/installs/nodejs/16.6.1/bin/node]
Aborted (core dumped)

Minimal reproducible example

import {
  DataSet,
  RegExpMatcher,
  englishRecommendedTransformers,
  pattern,
  parseRawPattern,
} from "obscenity";

const customDataset = new DataSet();
const bannedChatWords = [""];
bannedChatWords.forEach((item, _idx) => {
  const word = item.toLowerCase();
  customDataset.addPhrase((phrase) => {
    return phrase
      .setMetadata({ originalWord: word })
      .addPattern(parseRawPattern(word))
      .addPattern(pattern`|${word}`)
      .addPattern(pattern`${word}|`)
  });
});

const customMatcher = new RegExpMatcher({
  ...customDataset.build(),
  ...englishRecommendedTransformers
});

function messageViolation(message) {
  return customMatcher.getAllMatches(message).length > 0;
}

console.log("test", messageViolation("test"));

Steps to reproduce

  1. Save that code to index.js
  2. Run node index.js
  3. ...
  4. Profit?

Additional context

The banned words come from user-generated content, and my app was improperly storing an empty string. Whenever the code dynamically generated the banned word list with an empty string in the mix, it would take down the site.
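For anyone stuck on an affected version, one possible caller-side workaround is to drop empty entries before they ever reach the dataset. The sketch below is only an illustration: `sanitizeBannedWords` is a hypothetical helper, not part of obscenity, and it assumes the word list is a plain array that may contain empty, whitespace-only, or non-string entries.

```javascript
// Hypothetical helper (not part of obscenity): normalize a user-supplied
// word list before any of it is turned into patterns.
function sanitizeBannedWords(words) {
  const seen = new Set();
  for (const item of words) {
    if (typeof item !== "string") continue; // ignore non-string entries
    const word = item.trim().toLowerCase();
    if (word.length === 0) continue; // drop empty and whitespace-only entries
    seen.add(word); // the Set also removes duplicates
  }
  return [...seen];
}

// Example input with the kind of bad data described above.
const bannedChatWords = sanitizeBannedWords(["", "  ", "Foo", "foo", "bar"]);
console.log(bannedChatWords); // [ 'foo', 'bar' ]
```

The cleaned array could then be fed into the same `forEach` loop as in the example above, so that `parseRawPattern` and `pattern` never see an empty word.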

Node.js version

v16.6.1

Obscenity version

obscenity@^0.1.1: version "0.1.1"

jo3-l commented 1 year ago

Thanks for the report. I've identified the issue and will have a fix out shortly.

jo3-l commented 1 year ago

Released 0.1.4 with a fix; let me know if that works for you and of course feel free to reopen if anything still seems off @jwoertink.