infinitered / nsfwjs

NSFW detection on the client-side via TensorFlow.js
https://nsfwjs.com/
MIT License
7.97k stars · 532 forks

bad memory leak #435

Open effen1337 opened 3 years ago

effen1337 commented 3 years ago

Hello, I'm having a bad memory leak in production after implementing nsfwjs. I'm talking gigs of memory.

const axios = require('axios')
const tf = require('@tensorflow/tfjs-node');
tf.enableProdMode();
const nsfw = require('nsfwjs');
const modelFunc = async () => await nsfw.load();
let model;

setInterval(() => {
    tf.disposeVariables()
    console.log(tf.memory()) // { unreliable: true, numTensors: 49, numDataBuffers: 49, numBytes: 532830292 } and keeps increasing
}, 5000)

async function scan(url) {
    if (!model) {
        console.log('no model, loading.....')
        model = await modelFunc();
    }
    const pic = await axios.get(url, {
        responseType: 'arraybuffer',
    });
    tf.engine().startScope();
    const image = await tf.node.decodeImage(pic.data, 3)
    const predictions = await model.classify(image)
    image.dispose()
    tf.engine().endScope();
    return predictions;
}

for (const nsfwURL of [...(alex000kim / nsfw_data_scraper_chunk_ofurls)]) { // a chunk of nsfw image urls
    scan(nsfwURL).then(e => console.log(e))
}
}

I sometimes get TypeError: Cannot read property 'backend' of undefined; (@tensorflow/tfjs-core/dist/tf-core.node.js:3280:31)

Either way, the memory is never freed. What am I doing wrong?
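One thing worth noting about the loop in the report above: every scan() is kicked off at once, so every decoded image can be alive at the same time, and peak memory scales with the number of URLs rather than with one image. Here's a stand-in sketch of that effect in plain Node (no TensorFlow; fakeScan and the counters are hypothetical placeholders for decodeImage/classify/dispose):

```javascript
// Stand-in sketch: concurrent vs. sequential scanning and peak "live buffers".
let live = 0; // buffers currently "allocated"
let peak = 0; // high-water mark

async function fakeScan(url) {
  live += 1;                               // stands in for decodeImage
  peak = Math.max(peak, live);
  await new Promise(r => setImmediate(r)); // stands in for awaiting classify
  live -= 1;                               // stands in for image.dispose()
  return url;
}

const urls = Array.from({ length: 100 }, (_, i) => `img-${i}`);

async function main() {
  // Unthrottled, like the for-loop above: all 100 allocations overlap.
  await Promise.all(urls.map(u => fakeScan(u)));
  const concurrentPeak = peak;

  // Sequential: at most one allocation is live at a time.
  live = 0; peak = 0;
  for (const u of urls) await fakeScan(u);
  const sequentialPeak = peak;

  return { concurrentPeak, sequentialPeak };
}

const result = main();
result.then(r => console.log(r)); // { concurrentPeak: 100, sequentialPeak: 1 }
```

Even with correct dispose() calls, the unthrottled version holds every image at once, which looks exactly like "gigs of memory" under load.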

GantMan commented 3 years ago

It's a bit confusing the way this is constructed, but I see you're doing a lot of good things.

The thing I'm confused by is why you're calling tf.disposeVariables() on an interval. Does that cause the model to have to reload? I'm not familiar with startScope/endScope. Are you sure you should be managing the global engine like that?

Do you have an open source GitHub repo you can link me to, so I can pull down the code and run it and see if I can fix it?

effen1337 commented 3 years ago

@GantMan It's actually a worker that exposes the "scan" function, the loop part is just to show how I'm supplying the urls.

I used tf.disposeVariables() on an interval just to see the effect it would have on memory, and to check the results of tf.memory(). tf.memory() had been returning a VERY high numTensors (over 500); tf.disposeVariables() reduced it to around 50. The memory leak, however, wasn't fixed.

image.dispose() (or tf.dispose(image)) sadly had no effect at all.

As for tf.engine().startScope() and tf.engine().endScope(), I noticed online:

The way to clean any unused tensors in async code is to wrap the code that creates them between a startScope() and an endScope() call.

I do admit the two might be the cause of the error I mentioned above. Either way, the memory leak was never resolved.

The code shown above is enough to reproduce this behavior on Node.
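For anyone following along, the engine scope behaves roughly like a disposal stack: tensors created while a scope is open are freed at endScope, while allocations made before the scope (such as model weights) survive it. A plain-JavaScript sketch of that bookkeeping (FakeEngine is a hypothetical stand-in, not the real tfjs engine):

```javascript
// Sketch of scope-based tensor tracking: endScope frees only what was
// created after the matching startScope.
class FakeEngine {
  constructor() {
    this.scopes = [];              // stack of open scopes
    this.liveTensors = new Set();  // everything not yet disposed
  }
  makeTensor(name) {
    this.liveTensors.add(name);
    // Track the tensor in the innermost open scope, if any.
    if (this.scopes.length) this.scopes[this.scopes.length - 1].push(name);
    return name;
  }
  startScope() { this.scopes.push([]); }
  endScope() {
    // Dispose everything created since the matching startScope.
    for (const t of this.scopes.pop()) this.liveTensors.delete(t);
  }
}

const engine = new FakeEngine();
engine.makeTensor('modelWeights');  // created outside any scope: survives
engine.startScope();
engine.makeTensor('decodedImage');  // created inside the scope
engine.makeTensor('logits');
engine.endScope();                  // frees decodedImage and logits only

console.log([...engine.liveTensors]); // → [ 'modelWeights' ]
```

This is why wrapping only the classify step in startScope/endScope can't reclaim anything allocated outside that window, and why an unbalanced endScope is a plausible source of the "backend of undefined" error.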

GantMan commented 3 years ago

Hrmmm, TBH, I'm not seeing the source. Perhaps the model is getting created over and over? You could try model.model.dispose(). But I'll have to create a downloadable, runnable, debuggable demo to find it. I don't think it's the library, but I could be wrong, of course! Strange stuff indeed.

If you have a small, open source, ready to run demo I'll pull it down and throw an hour at it.

If I were you, I'd comment out line by line and watch tensor memory with a log.

GantMan commented 3 years ago

Was this resolved?

phamleduy04 commented 3 years ago
[Screenshot attached: Screen Shot 2021-02-23 at 8.34.56 AM]

I have the same problem. image.dispose() doesn't seem to work.

GantMan commented 3 years ago

Can I see some code? Are you using ts-node?

abdullahIsa commented 7 months ago
disposeVariables

I am having this issue too. Any solution?