PDLPorters / pdl

Scientific computing with Perl
http://pdl.perl.org

Make PDL be a C library with a Perl interface so it can be used from other C (or dynamic language) code #358

Open mohawk2 opened 2 years ago

mohawk2 commented 2 years ago

Tasks:

This is somewhat connected to the #349 ideas on making a broadcastloop vtable, but only somewhat.

zmughal commented 2 years ago

PDL should also allow custom allocators, for situations where a different allocator is more efficient or integrates better with an existing external library.

For example, aligned memory (e.g., through C11's aligned_alloc or the Windows-specific _aligned_malloc) can give a significant performance boost because it allows the use of particular SSE/AVX instructions[*].
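To make the aligned-memory idea concrete, here is a minimal sketch of a portable wrapper around the two calls mentioned above. The helper names (aligned_buf_alloc, aligned_buf_free, is_aligned) are made up for illustration and are not part of PDL; the only real APIs used are aligned_alloc/free and _aligned_malloc/_aligned_free.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

/* Hypothetical helper, not a PDL function: allocate `size` bytes whose
 * address is a multiple of `alignment` (which must be a power of two). */
static void *aligned_buf_alloc(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    /* Note the argument order: size first, then alignment. */
    return _aligned_malloc(size, alignment);
#else
    /* C11 aligned_alloc expects size to be a multiple of alignment,
     * so round the request up. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
#endif
}

/* Memory from _aligned_malloc must go to _aligned_free, not free. */
static void aligned_buf_free(void *p)
{
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}

/* Returns nonzero if `p` sits on an `alignment`-byte boundary. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

A caller wanting AVX-friendly storage would request 32-byte alignment, e.g. aligned_buf_alloc(32, nbytes), and release it with aligned_buf_free. The separate free function is the reason a plain "swap malloc for something else" is not enough on Windows.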

As discussed in IRC, I would like to be able to set this at runtime per-ndarray (internal C interface).

A related question is how to provide a high-level way to indicate which allocator the output of an operation between ndarrays (whether aligned-alloc-N-bytes or regular-malloc ndarrays) should use. My goal is that an operation on an aligned-alloc-N-bytes ndarray would output to another aligned-alloc-N-bytes ndarray.
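One possible shape for a per-ndarray allocator chosen at runtime, with outputs inheriting the allocator of an input, is sketched below. This is only an illustration under stated assumptions: the buffer struct stands in for an ndarray's data, and every name here (buf_allocator, buffer, buffer_init_like, and so on) is hypothetical rather than anything in PDL's C API. The aligned variant uses C11 aligned_alloc only, for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator interface: a pair of function pointers. */
typedef struct {
    void *(*alloc)(size_t nbytes);
    void  (*release)(void *p);
    const char *name;
} buf_allocator;

/* Stand-in for an ndarray's storage; it remembers who allocated it. */
typedef struct {
    void *data;
    size_t nbytes;
    const buf_allocator *allocator;
} buffer;

static void *plain_alloc(size_t n)   { return malloc(n); }
static void  plain_release(void *p)  { free(p); }

/* 32-byte aligned storage; size rounded up as C11 aligned_alloc expects. */
static void *aligned32_alloc(size_t n)
{
    return aligned_alloc(32, (n + 31) / 32 * 32);
}
static void aligned32_release(void *p) { free(p); }

static const buf_allocator PLAIN_ALLOCATOR =
    { plain_alloc, plain_release, "malloc" };
static const buf_allocator ALIGNED32_ALLOCATOR =
    { aligned32_alloc, aligned32_release, "aligned-32" };

/* Allocate storage with the allocator picked at runtime; 0 on success. */
static int buffer_init(buffer *b, size_t nbytes, const buf_allocator *a)
{
    assert(b != NULL && a != NULL);
    b->data = a->alloc(nbytes);
    b->nbytes = nbytes;
    b->allocator = a;
    return b->data != NULL ? 0 : -1;
}

/* Free through the same allocator that produced the data. */
static void buffer_free(buffer *b)
{
    if (b->data != NULL)
        b->allocator->release(b->data);
    b->data = NULL;
    b->nbytes = 0;
}

/* Create an output the same size as `in`, using the same allocator as `in`. */
static int buffer_init_like(buffer *out, const buffer *in)
{
    return buffer_init(out, in->nbytes, in->allocator);
}

static int buffer_is_aligned(const buffer *b, size_t alignment)
{
    return ((uintptr_t)b->data % alignment) == 0;
}
```

Storing the allocator pointer next to the data means the free path never has to guess how the memory was obtained. The sketch leaves open which input should win when operands use different allocators; that policy is the part that would need a high-level interface.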

[*] Taking advantage of SSE/AVX for particular PDL ops is also something to look into and is likely a whole big project on its own.

mohawk2 commented 2 years ago

Would one way to get close(?) to this with current PDL be to make a PDL subclass (adjusting your suggested name to PDL::Aligned) that allocates memory with a suitable alignment and sets PDL_DONTTOUCHDATA? An ndarray of the appropriate size could then be constructed using current code and passed as the output ndarray(s) of given operations.

An alternative approach might be just to use MALLOCDBG in the PDL config so that all memory is allocated with some alignment, or to provide some other means of globally setting allocate/free.

mohawk2 commented 2 years ago

SSE/AVX utilisation might be better captured on #349. Notes should include pointers (ha!) on how to do so from the C level.