emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

Embind vectors, sets or maps to be converted to JS Arrays, Sets and Maps #11070

Open skonstant opened 4 years ago

skonstant commented 4 years ago

Why do std::vectors need to be registered in Embind, only to appear in JS as "vector" types rather than as arrays? This makes them non-iterable in for loops and in Angular templates.

Likewise, std::map and std::unordered_map could become a JS Map.

And std::set and std::unordered_set could become a JS Set.

This would come in really handy, as the containers could be used in JS without knowing that they come from C++ and without changing the JS code to adapt to C++ constructs (iteration is probably the first that comes to mind).

kleisauke commented 4 years ago

I'd like to see this too. As a workaround I allow both JS arrays and std::vector classes as input.

```cpp
#include <emscripten/bind.h>
#include <emscripten/val.h>
#include <string>
#include <vector>

using namespace emscripten;

/**
 * Determines whether a JS value is of the specified type.
 */
inline bool is_type(emscripten::val value, const std::string &type) {
    return value.typeOf().as<std::string>() == type;
}

// Embind vector wrappers expose a size() method; plain JS values do not.
inline bool is_vector(emscripten::val value) {
    return is_type(value["size"], "function");
}

/**
 * Converts an input array to a vector.
 */
template <typename T>
std::vector<T> to_vector(emscripten::val v) {
    std::vector<T> rv;

    if (v.isArray()) {
        rv = emscripten::vecFromJSArray<T>(v);
    } else if (is_vector(v)) {
        rv = v.as<std::vector<T>>();
    } else {
        // Allow single values as well
        rv = {v.as<T>()};
    }

    return rv;
}

EMSCRIPTEN_BINDINGS(my_module) {
    // Register vector bindings
    register_vector<int>("VectorInt");
    register_vector<double>("VectorDouble");
    register_vector<std::string>("VectorString");

    function("inputIntVector", optional_override([](emscripten::val value) {
        std::vector<int> vector = to_vector<int>(value);
        // ...
    }));

    function("inputDoubleVector", optional_override([](emscripten::val value) {
        std::vector<double> vector = to_vector<double>(value);
        // ...
    }));

    function("inputStringVector", optional_override([](emscripten::val value) {
        std::vector<std::string> vector = to_vector<std::string>(value);
        // ...
    }));
}
```

Perhaps std::vector should also appear as JS array upon returning from C++? As an example, this makes such conversions unnecessary if supported:

```js
function* vector_values(vector) {
    for (let i = 0; i < vector.size(); i++)
        yield vector.get(i);

    // Free the C++ object once every element has been read.
    vector.delete();
}

const ints = [...vector_values(emval_test_return_vector())];
console.log(ints); // (3) [10, 20, 30]
```

https://github.com/emscripten-core/emscripten/blob/0ea8070948c030dd681a524162d420d501b95e9a/tests/embind/embind_test.cpp#L1216-L1219

octopoulos commented 3 years ago

For the std::vector output, it can be converted in JavaScript with a line like this, for example:

new Array(moves.size()).fill(0).map((_, id) => moves.get(id))

And then you're free to iterate this array.
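
A slightly shorter variant of the same idea, as a sketch only: `Array.from` with a length-only object skips the intermediate `fill(0)`. The `makeFakeVector` helper below is a hypothetical stand-in for an Embind vector (it only mimics `size()` and `get()`), so that the snippet runs without a compiled module; `vectorToArray` is likewise just an illustrative name.

```javascript
// Hypothetical stand-in for an Embind vector wrapper. A real wrapper would be
// returned by a bound C++ function whose vector type was registered.
function makeFakeVector(items) {
    return {
        size: () => items.length,
        get: (i) => items[i],
    };
}

// Copy the elements of a vector-like wrapper into a plain JS array.
function vectorToArray(vector) {
    return Array.from({ length: vector.size() }, (_, i) => vector.get(i));
}

const moves = makeFakeVector(["e2e4", "e7e5", "g1f3"]);
const movesArray = vectorToArray(moves);

// The result is an ordinary array, so for-of and array methods work.
for (const move of movesArray) {
    console.log(move);
}
```

With a real Embind vector, the wrapper would still need a `delete()` call once the copy has been made.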

mattbradley commented 3 years ago

You should be able to define implicit bindings for any std::vector using custom marshalling. Something like this:

namespace emscripten {
namespace internal {

template <typename T, typename Allocator>
struct BindingType<std::vector<T, Allocator>> {
    using ValBinding = BindingType<val>;
    using WireType = ValBinding::WireType;

    static WireType toWireType(const std::vector<T, Allocator> &vec) {
        return ValBinding::toWireType(val::array(vec));
    }

    static std::vector<T, Allocator> fromWireType(WireType value) {
        return vecFromJSArray<T>(ValBinding::fromWireType(value));
    }
};

template <typename T>
struct TypeID<T,
              typename std::enable_if_t<std::is_same<
                  typename Canonicalized<T>::type,
                  std::vector<typename Canonicalized<T>::type::value_type,
                              typename Canonicalized<T>::type::allocator_type>>::value>> {
    static constexpr TYPEID get() { return TypeID<val>::get(); }
};

}  // namespace internal
}  // namespace emscripten

This will automatically convert a JS array to a std::vector (for C++ function parameters) and a std::vector to a JS array (for C++ return values) without having to mess with register_vector as long as the T type in std::vector<T> has bindings defined.

struct NumWrapper {
    double num;
};

std::vector<NumWrapper> sort(std::vector<NumWrapper> nums) {
    std::sort(nums.begin(), nums.end(), [](const NumWrapper &a, const NumWrapper &b) {
        return a.num < b.num;
    });

    return nums;
}

EMSCRIPTEN_BINDINGS(some_module) {
    value_object<NumWrapper>("NumWrapper").field("num", &NumWrapper::num);
    function("sort", &sort);

    // `register_vector<NumWrapper>` isn't needed; vectors are implicitly converted to and from JS arrays.
}
Module.sort([{num: 2}, {num: 1}, {num: 3}]);
    => [{num: 1}, {num: 2}, {num: 3}]

Similar marshalling could be added for converting std::map and std::set to and from the JS analogs. Maybe this would be a helpful addition to Embind as an alternative to register_vector and register_map?

bvibber commented 3 years ago

Be aware that if you copy a vector or map into a JavaScript Array or Map, the new JavaScript objects will not forward modifications of their contents to the original C++ objects -- this would be a change in capabilities as well as an API change!
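
To illustrate the point about copies, here is a small sketch. `FakeVector` is a hypothetical stand-in for an Embind vector wrapper (assumed to expose `size()`, `get()` and `push_back()`), so no compiled module is needed to run it.

```javascript
// Hypothetical stand-in for an Embind vector wrapper; in a real module the
// data would live in the C++ object behind the wrapper.
class FakeVector {
    constructor(items) { this.items = [...items]; }
    size() { return this.items.length; }
    get(i) { return this.items[i]; }
    push_back(value) { this.items.push(value); }
}

const vector = new FakeVector([10, 20, 30]);

// Snapshot: copy the current contents into a plain JS array.
const snapshot = Array.from({ length: vector.size() }, (_, i) => vector.get(i));

// Changing the copy leaves the original untouched...
snapshot.push(40);
// ...and changing the original does not show up in the copy.
vector.push_back(99);

console.log(snapshot[3], vector.get(3)); // 40 99
```

After the copy is taken, the two containers evolve independently, which is the change in capabilities described above.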

It sounds like what might be useful, though, is making the JS object wrappers for vectors and maps support native JS iteration and property indexing to make them easier to use on the JS side?

Iteration (for for-of loops and anything that takes an iterable) should be straightforward: it only requires setting a special method under Symbol.iterator on the wrapper.
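
A minimal sketch of that idea, under stated assumptions: `FakeVector` is a hypothetical stand-in for the class Embind generates for a registered vector, offering only `size()` and `get()`. The iterator is attached to the prototype once, after which every instance works with for-of and spread.

```javascript
// Hypothetical stand-in for the class Embind generates for a registered vector.
class FakeVector {
    constructor(items) { this.items = [...items]; }
    size() { return this.items.length; }
    get(i) { return this.items[i]; }
}

// Patch the prototype once; every instance then becomes iterable.
FakeVector.prototype[Symbol.iterator] = function* () {
    for (let i = 0; i < this.size(); i++) {
        yield this.get(i);
    }
};

const vector = new FakeVector([10, 20, 30]);

let sum = 0;
for (const value of vector) {
    sum += value;
}

const spread = [...vector];
console.log(sum, spread); // 60 [ 10, 20, 30 ]
```

With a real module, the analogous patch would presumably target the prototype of the registered class (for example `Module.VectorInt.prototype` for a vector registered as "VectorInt"); that part is an assumption and is not tested here. Unlike a copy, the iterator reads from the live object, so modifications made between iterations remain visible.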

I'm less certain whether property indexing (accessing as vector[index] instead of vector.get(index)) is doable without using a Proxy, which might have performance implications.
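
For reference, a Proxy-based version might look like the sketch below. `FakeVector` is a hypothetical stand-in for an Embind vector wrapper (assumed to expose `size()`, `get()` and `set()`), and `indexable` is an illustrative helper name, not an Embind API.

```javascript
// Hypothetical stand-in for an Embind vector wrapper.
class FakeVector {
    constructor(items) { this.items = [...items]; }
    size() { return this.items.length; }
    get(i) { return this.items[i]; }
    set(i, value) { this.items[i] = value; return true; }
}

// Matches canonical non-negative integer property keys such as "0" or "17".
const INDEX_KEY = /^(0|[1-9]\d*)$/;

// Wrap a vector so that vector[i] forwards to get(i) and set(i, value).
function indexable(vector) {
    return new Proxy(vector, {
        get(target, prop) {
            if (typeof prop === "string" && INDEX_KEY.test(prop)) {
                return target.get(Number(prop));
            }
            const value = Reflect.get(target, prop, target);
            // Bind methods so they run against the real wrapper, not the proxy.
            return typeof value === "function" ? value.bind(target) : value;
        },
        set(target, prop, value) {
            if (typeof prop === "string" && INDEX_KEY.test(prop)) {
                target.set(Number(prop), value);
                return true;
            }
            return Reflect.set(target, prop, value, target);
        },
    });
}

const vector = indexable(new FakeVector([10, 20, 30]));
vector[1] = 25;
console.log(vector[0], vector[1], vector.size()); // 10 25 3
```

Every property access, including ordinary method lookups, goes through the trap, which is where the performance concern comes from.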

Note that neither of these features would work in Internet Explorer or other very old JS engines.

benjamind commented 3 years ago

From a performance perspective would an array bound type be more performant than having to iterate the vector?

I'm assuming there's probably some vector size performance trade off here?

mmarczell-graphisoft commented 2 years ago

@mattbradley I'm experiencing a problem with the vector conversion code you have supplied above.

seems to fix it.

chenzx commented 2 years ago

I tried Embind's value_array for passing a JS number array to a C++ struct, but the arrays in the legacy JS code are not fixed-length. So I switched to std::vector, but that results in a weird error: BindingError: Cannot pass "0,0,0,1" as a vector

So what I want is a variable-length std::vector that can be used like a value_array...

chenzx commented 2 years ago

The custom marshalling code above can map a JS array to a std::vector on the C++ side, but it cannot map a std::vector return value from C++ to a JS array: that fails with a compile error...

ZheyangSong commented 1 year ago

It'd be really nice to consider @mattbradley's custom marshalling approach to ease the use of std::vector, with some heads-up from @brion.

Or is it better, from a performance point of view, to use pointers and an array size, or a shared buffer? @brion

MatthieuMv commented 2 months ago

The issue with the custom marshalling approach above (implicit std::vector bindings via BindingType) is that it breaks TypeScript generation, because the vector is now recognised as an 'any' type. It would be great to expose C++ containers as typed TypeScript arrays, as this would allow seamless interoperability.

Has anyone succeeded in generating such typed arrays?

mmarczell-graphisoft commented 2 months ago

@MatthieuMv

Before Emscripten added TypeScript support, I wrote my own d.ts generator, which handles that case:

https://github.com/marczellm/emscripdtsgen

MatthieuMv commented 2 months ago

Thank you @mmarczell-graphisoft. I ended up declaring an emscripten::val type for each container I have, using EMSCRIPTEN_DECLARE_VAL_TYPE. Then I registered them using emscripten::register_type<ObjectList>("Object []");. Because I wanted to use my containers in interfaces and objects, I used emscripten::internal::BindingType to convert from the C++ container to the type declared with EMSCRIPTEN_DECLARE_VAL_TYPE.

This allows me to register and use these containers as properties and as function parameters.