Ivorforce / NumDot

Tensor math and scientific computation for the Godot game engine.
https://numdot.readthedocs.io
MIT License

Adaptations (in-place views) of native GD objects #42

Closed Ivorforce closed 3 days ago

Ivorforce commented 2 months ago

Requirements:

Example with #25:

var packed = PackedFloat32Array([1, 2, 3])
var tensor = nd.as_array(packed) # View, no copy
tensor.add(tensor, 2) # In-place: [3, 4, 5]
tensor.multiply(tensor, tensor) # In-place: [9, 16, 25]
print(packed) # Prints [9, 16, 25]

Example without #25:

var packed = PackedFloat32Array([1, 2, 3])
var tensor = nd.array(packed) # Copy
tensor.add(tensor, 2) # In-place on the copy: [3, 4, 5]
tensor.multiply(tensor, tensor) # In-place on the copy: [9, 16, 25]
packed = tensor.to_packed_float32_array() # Copy
print(packed) # Prints [9, 16, 25]

The additional conversions could be slow enough to defeat the purpose of using NumDot in the first place, so this feature could open up a whole new range of lightweight processing possibilities. It would essentially remove the final 2 copies standing in the way of using NumDot for all high-performance calculations.

Ivorforce commented 1 month ago

#8 is implemented, and it ended up using xarray_adaptor objects. This should enable seamless in-place manipulation of gd objects, even without adding new ComputeVariant cases and increasing the binary size. We'd only need a StoreVariant for every gd object, holding a reference to it (not sure yet how that would work), and a way to convert it to a ComputeVariant, similar to the existing function.
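To make the store/compute split a bit more concrete, here is a rough C++ sketch. It is not NumDot's actual code: `PackedFloat32`, `PackedInt32`, `ComputeView` and `add_scalar` are hypothetical names, `std::vector` stands in for Godot's packed arrays, and the real `StoreVariant` and `ComputeVariant` will look different. It only shows the shape of the idea: the store keeps the object alive, and the conversion hands out a non-owning typed view over the same memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <variant>
#include <vector>

// Stand-ins for Godot's packed arrays (assumption: the real types come
// from godot-cpp and behave differently).
using PackedFloat32 = std::vector<float>;
using PackedInt32 = std::vector<std::int32_t>;

// Hypothetical store: one alternative per supported object type. Each
// alternative keeps the original object alive through a shared_ptr.
using StoreVariant = std::variant<
    std::shared_ptr<PackedFloat32>,
    std::shared_ptr<PackedInt32>>;

// Hypothetical compute-side view: a typed pointer plus a length, with no
// ownership of the underlying buffer.
template <typename T>
struct ComputeView {
    T *data;
    std::size_t size;
};

using ComputeVariant =
    std::variant<ComputeView<float>, ComputeView<std::int32_t>>;

// Converts a store into a non-owning view over the same memory.
inline ComputeVariant to_compute_variant(const StoreVariant &store) {
    return std::visit(
        [](const auto &ptr) -> ComputeVariant {
            using T = typename std::decay_t<decltype(*ptr)>::value_type;
            return ComputeView<T>{ptr->data(), ptr->size()};
        },
        store);
}

// Adds `value` to every element through the view, whatever its element type.
inline void add_scalar(const ComputeVariant &view, int value) {
    std::visit(
        [value](const auto &v) {
            using T = std::decay_t<decltype(*v.data)>;
            for (std::size_t i = 0; i < v.size; ++i) {
                v.data[i] += static_cast<T>(value);
            }
        },
        view);
}
```

Since the view aliases the store's buffer, a write through the view shows up in the original object, and the compute side only ever sees one case per element type, not one per gd object.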

Ivorforce commented 1 month ago

I had a look at this today. It looks like all Packed types are copy-on-write when exchanged with a gdextension. This may be worth a godot proposal in the future, but for now it means we could only read Packed types in place, not write to them. Still, we could make use of that.

A StoreVariant could be made for each Packed* type, which should be instantaneous thanks to copy-on-write. to_compute_variant would be split into a normal and a const variety, and upon first write access, the NDArray would implicitly fork itself off whatever it was copied from (finally resulting in a copy). The only important thing to keep in mind would be to pack the store inside a shared_ptr, so that different views don't each end up with their own copy, but that should be easy to do. The fitting to_packed_x function would then be instantaneous too, because it can just return the store, which is safe and cheap through copy-on-write.
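A minimal sketch of that fork-on-first-write behaviour, again in plain C++ with `std::vector` standing in for a Packed array. `PackedStore`, `View` and `has_forked` are hypothetical names for illustration, and Godot's own copy-on-write mechanics are not modelled here. Reads go straight to the borrowed source; the first write makes the single copy, and because the views share the store through a shared_ptr, they all see the forked buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a Godot Packed*Array (assumption: the real type is
// copy-on-write inside the engine).
using PackedFloat32 = std::vector<float>;

// Hypothetical store: reads go to the borrowed source until the first
// write, which forks a private buffer exactly once.
class PackedStore {
public:
    explicit PackedStore(std::shared_ptr<const PackedFloat32> source)
        : source_(std::move(source)) {}

    std::size_t size() const { return current().size(); }

    // Const access never copies.
    const float *read() const { return current().data(); }

    // Write access forks off the source on first use.
    float *write() {
        if (!owned_) {
            owned_ = std::make_unique<PackedFloat32>(*source_);  // the one copy
        }
        return owned_->data();
    }

    bool has_forked() const { return owned_ != nullptr; }

private:
    const PackedFloat32 &current() const { return owned_ ? *owned_ : *source_; }

    std::shared_ptr<const PackedFloat32> source_;
    std::unique_ptr<PackedFloat32> owned_;
};

// Views hold the store through a shared_ptr, so a fork triggered by one
// view is seen by all of them and no view makes a copy of its own.
struct View {
    std::shared_ptr<PackedStore> store;
    std::size_t offset;

    float get(std::size_t i) const { return store->read()[offset + i]; }
    void set(std::size_t i, float value) { store->write()[offset + i] = value; }
};
```

In this sketch the source array is never modified, which mirrors the read-only constraint described above: writes land in the forked buffer only.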

All in all, I think that's still a good path to go down.

Ivorforce commented 3 days ago

This is complete now.