Open sambhare opened 7 years ago
I am interested in this functionality too, to provide memory view into some std::vector to the javascript side.
From browsing the sources, it looks like we can not use nbind::Buffer
as a return type: https://github.com/charto/nbind/blob/d5b051aad177e975fa7e1a7f9009fc7c6ddf652e/include/nbind/v8/Buffer.h#L52. I'll try to write my own BindingType
instantiation for this, I'll let you know if it works =)
@Luthaf, thanks for looking into this! I think the reason it wasn't implemented before is complexities in the Emscripten target. JavaScript can receive a view of the C++ memory, but would have to explicitly release the reference because in browsers, the C++ code cannot know when the JavaScript object gets garbage collected. Then it's easy to leak memory in the C++ heap, which is already restricted to a constant size. Another option is to pass the buffer as a copy, but that negates many of the benefits.
Can you think of a solution portable also into Emscripten? Otherwise, this could still be released as a native-only feature.
I am not sure I understand the issue here. Do C++ destructors still run with Emscripten? Then the problem is that we don't know when to run destructors because the JS side could still be holding a reference to the buffer data?
My use case looks like this:
class Container {
private:
// buffer is something like a std::vector,
// except I can't access the actual std::vector directly
some_type buffer;
public:
size_t length() {return data.length();}
double* data() {return &buffer[0];}
}
I would like to provide read and write access to the data to the JS side, but I don't want the JS side to free the memory at any point. The C++ object is the owner of the memory.
This is fundamentally unsafe, as the vector can be resized and then the pointer becomes invalid. But this is OK for my use case. I've got something working with the following code, but I only implemented the node.js part for now.
Exactly, in Emscripten we can't know what references JS is holding. It's like security boxing is reversed compared to Node.js native code: C++ is limited to accessing a single typed array and has no knowledge of JS engine internals. Meanwhile JS has direct access to the entire C++ heap.
The current nbind::Buffer
holds a reference to memory owned by JavaScript. If a new buffer is returned, the memory is allocated by C++ which then would end up owning it. This needs to be handled very differently in the nbind::Buffer
class. In native Node.js it's important that the memory stays allocated until JS has lost all references to it, or the process will crash or allow security exploits. In Emscripten, either JS has to explicitly release references or the programmer must take care not to access de-allocated memory. However, the worst that can happen is the JS code behaving strangely.
Returning buffers would be more performant in Emscripten, because they can be views into the C++ heap. Meanwhile JS-allocated arrays are currently copied, because the Asm.js engine has no memory mapping support for directly accessing any JS object other than its own initial heap.
Ok, so what I need is different from nbind::Buffer
and more like an externalized v8 ArrayBuffer
. It could be made working with emscriten without too many issues (JS code behaving strangely at worst). For the node.js side, you use std::shared_ptr
to hold everything, am I right ? Then you could use the aliasing constructor of std::shared_ptr
to build the new shared buffer, making it use the same reference count as the memory owner. But I don't know if it is desirable to have in nbind, as it might be relatively hard to document, and the behaviour would change between node and emscripten.
It seems like a useful addition to nbind. I guess from C++ you'd return a struct by value, containing std::shared_ptr<unsigned char *>
and size_t len
. Then toWireType
would create a new externalized v8::ArrayBuffer
and a new heap-allocated C++ object holding an std::shared_ptr<unsigned char *>
to the data and a Nan::Persistent<v8::Object>
with a weak reference to the ArrayBuffer
. Once the weak reference GC callback fires, the object deletes itself which also resets the shared_ptr
it was holding.
Such an wrapper object already exists here in External.h. It seems it could be re-used, if the template argument is changed from Data
to DataPtr
. The addDestructor
function in that same file should then create it using new Destructor<Data *>(callback, data, handle);
and the new Buffer wrapper could instead use new Destructor<std::shared_ptr<unsigned char *> >(...);
passing the ArrayBuffer
as the handle.
The ArrayBuffer
is then passed to JavaScript, which doesn't need any additional wrappers around the buffer it receives... Except maybe it's possible to add a no-op free
method on it, for future Emscripten compatibility.
I guess you could use void *
in place of unsigned char *
. Or maybe the buffer class should be templated to hold the right type of pointer. So you in C++ you do return ExternalBuffer(data, len);
or callback.call<void>(ExternalBuffer(data, len));
where data
should be any shared_ptr
or unique_ptr
. I don't think naked pointers should be supported, it's too dangerous and unclear who will free them.
For those looking for a working example of this, I have a Node binding working here passing a buffer via a callback: https://github.com/openzim/node-libzim/blob/e790bfc55352a027f1e4d7e8acce9bfaf369c461/src/ZimReader.cc#L45
On the JS side I had to use a slice to avoid memory issues: https://github.com/openzim/node-libzim/blob/e790bfc55352a027f1e4d7e8acce9bfaf369c461/src/ZimReader.ts#L39
This is currently a bit of a hack and definitely not efficient, but it's working well enough for demos 😄
Is it possible to create an nbind::Buffer object inside a wrapped function and send that object to Javascript via a callback? I created the following example but all I see on the Javascript side a null.
Corresponding JavaScript code (buftest.js):
Outputs:
Attempting to
console.log(buf.length)
results in aTypeError: Cannot read property 'length' of null