nodejs / node-addon-api

Module for using Node-API from C++
MIT License
2.15k stars 460 forks source link

How to properly handle `Object/Buffer/Uint8Array` in `AsyncWorker`? #1453

Closed mahnunchik closed 7 months ago

mahnunchik commented 7 months ago

Hello, I couldn't find in the documentation and examples how to properly handle Object/Buffer/Uint8Array parameters in AsyncWorker?

The task: I'd like to receive data as Uint8Array and use it in AsyncWorker.

If I understand correctly, the data can be garbage collected.

Should I:

Naive code:

#include "napi.h"

class DoHeavyMathWorker : public Napi::AsyncWorker {
  public:
    DoHeavyMathWorker(const Napi::Env& env, uint8_t* data)
      : Napi::AsyncWorker{env, "DoHeavyMathWorker"},
      m_deferred{env},
      m_data{data}
      {}

    Napi::Promise GetPromise() { return m_deferred.Promise(); }

  protected:
    void Execute() {
      // heavy math with data
      if (m_data[0] == 0) {
        m_result = 1;
      } else {
        m_result = 2;
      }
    }

    void OnOK() { m_deferred.Resolve(Napi::Number::New(Env(), m_result)); }
    void OnError(const Napi::Error& err) { m_deferred.Reject(err.Value()); }

  private:
    Napi::Promise::Deferred m_deferred;
    uint8_t* m_data;
    uint32_t m_result;
};

Napi::Value DoHeavyMath(const Napi::CallbackInfo& info) {
  Napi::Env env = info.Env();

  if (!info[0].IsObject()) {
    Napi::TypeError::New(env, "arg must be an object")
        .ThrowAsJavaScriptException();
    return env.Undefined();
  }
  Napi::Object obj = info[0].As<Napi::Object>();
  uint8_t* data = obj.Get("data").As<Napi::Uint8Array>().Data();

  DoHeavyMathWorker* worker = new DoHeavyMathWorker(env, data);
  worker->Queue();
  return worker->GetPromise();
}

Napi::Object Init(Napi::Env env, Napi::Object exports) {
  exports.Set("doHeavyMath", Napi::Function::New(env, DoHeavyMath));
  return exports;
}

NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)

JS part:

const result = await addon.doHeavyMath({
  data: new Uint8Array([42]),
});
console.log('result:', result);

mahnunchik commented 7 months ago

Yes, I understand what it is possible to copy data in DoHeavyMathWorker constructor but my goal is to use raw/shared data from js side.

legendecas commented 7 months ago

It should be possible to keep the raw memory of an Uint8Array alive across the lifetime of an AsyncWorker by using ObjectReference to the Uint8Array. But there is no way to keep a reference to the backing store of an ArrayBuffer, which can be detached from the ArrayBuffer object by ArrayBuffer.prototype.transfer or many APIs.

At the moment, if you are pretty sure that the ArrayBuffer will not be detached during the work of the AsyncWorker, an ObjectReference to the Uint8Array/ArrayBuffer would be sufficient.

We should also consider introducing a new node-api, such as napi_tranfer_arraybuffer, would be more promising in such cases to ensure that the memory of the given ArrayBuffer will not be collected.

mahnunchik commented 7 months ago

Let's look at two cases:

My case

I'm fully confident that the content of Uint8Array will not be changed in any way while the async worker is running. And even some time after execution, the Uint8Array will also remain unchanged in scope of function call. Example js code:

async function() {
  const params = {
    data: new Uint8Array([42]),
  };

  const result = await addon.doHeavyMath(params);

  params.result = result;

  console.log('result:', result);
}

Question 1: Do I understand correctly what I can use Uint8Array as is without wrapping in ObjectReference and and it doesn't have to be garbage collected?

Common case

Of course, the safest way is to copy the raw data.

To use the data without copying it into a link, it can be wrapped:

  1. Whole object (or for example Arrray of Uint8Array-s)
  2. Uint8Array (or each Uint8Array from arguments)

Question 2: Which option is correct? If both are correct, what are the pros and cons of each?

Previous example modified to wrap object as ObjectReference.

#include "napi.h"

class DoHeavyMathWorker : public Napi::AsyncWorker {
  public:
    DoHeavyMathWorker(const Napi::Env& env, Napi::Object& obj, uint8_t* data)
      : Napi::AsyncWorker{env, "DoHeavyMathWorker"},
      m_reference{Napi::ObjectReference::New(obj, 1)},
      m_deferred{env},
      m_data{data}
      {}

    Napi::Promise GetPromise() { return m_deferred.Promise(); }

  protected:
    void Execute() {
      // heavy math with data
      if (m_data[0] == 0) {
        m_result = 1;
      } else {
        m_result = 2;
      }
    }

    void OnOK() {
      m_deferred.Resolve(Napi::Number::New(Env(), m_result));
      m_reference.Unref();
    }
    void OnError(const Napi::Error& err) {
      m_deferred.Reject(err.Value());
      m_reference.Unref();
    }

  private:
    Napi::ObjectReference m_reference;
    Napi::Promise::Deferred m_deferred;
    uint8_t* m_data;
    uint32_t m_result;
};

Napi::Value DoHeavyMath(const Napi::CallbackInfo& info) {
  Napi::Env env = info.Env();

  if (!info[0].IsObject()) {
    Napi::TypeError::New(env, "arg must be an object")
        .ThrowAsJavaScriptException();
    return env.Undefined();
  }
  Napi::Object obj = info[0].As<Napi::Object>();
  uint8_t* data = obj.Get("data").As<Napi::Uint8Array>().Data();

  DoHeavyMathWorker* worker = new DoHeavyMathWorker(env, obj, data);
  worker->Queue();
  return worker->GetPromise();
}

Napi::Object Init(Napi::Env env, Napi::Object exports) {
  exports.Set("doHeavyMath", Napi::Function::New(env, DoHeavyMath));
  return exports;
}

NODE_API_MODULE(NODE_GYP_MODULE_NAME, Init)

Question 3: Should I get uint8_t* data in outer method DoHeavyMath or I can call uint8_t* data = obj.Get("data").As<Napi::Uint8Array>().Data(); inside AsyncWorker?

mahnunchik commented 7 months ago

@legendecas could you please help me to figure out the difference between references. This is the same question on SO: https://stackoverflow.com/questions/78031729/how-to-properly-handle-object-buffer-uint8array-in-asyncworker

legendecas commented 7 months ago

Q1: Yes, you can create an ObjectReference on a Uint8Array as long as you still need to access the buffer. Q2: Sorry, I didn't understand the question. Q3: It is fine to access the Uint8Array inside AsyncWorker::OnOk, AsyncWorker::OnError, and other methods that are not called in AsyncWorker::Execute which is called off the JS thread.

Note that persisting the raw memory data pointer and releasing the Uint8Array is not safe since the raw memory buffer may be released once the Uint8Array is been reclaimed by GC.

I hope this helps.

mahnunchik commented 7 months ago

@legendecas Thank you for your answers.

More details about the second question.

For example for the following js code:

async function() {
  const params = {
    // Array of Uint8Array
    data: [
      new Uint8Array([42]),
      new Uint8Array([24]),
      new Uint8Array([ff]),
    ]
  };

  const result = await addon.doHeavyMath(params);

  console.log('result:', result);
}

Should I wrap each Uint8Array as an ObjectReference or I can wrap whole data Array as single ObjectReference?

legendecas commented 7 months ago

You can create a reference for the array as long as you are confident that the array is not modified by outer scope.

In fact, you can avoid creating any reference by a proper model in JavaScript:

class MyTask {
  #buffer;
  constructor(buffer) {
    this.#buffer = buffer;
  }

  async startWork() {
    const result = await addon.doHeavyMath(this.#buffer);
    return result;
  }
}

The buffer is referenced by a JS object and the JS object will not be reclaimed by GC as long as the task of the async worker is not finished.

mahnunchik commented 7 months ago

Thank you for the clarification. I think an example of using Uint8Array/ObjectReference can be added here https://github.com/nodejs/node-addon-examples/issues/380

legendecas commented 7 months ago

Closing as the issue has been answered. Please feel free to reopen if there is anything else to discuss.