Open nordlow opened 3 weeks ago
There are definitely spots where platform-specific performance tuning can be a great idea. This can be a topic for a proper research project, potentially exposing threshold variables with `extern` and tuning our `sz_find`, `sz_copy`, and other kernels for the target platform. It gets a bit trickier if the same kernel depends on multiple variables and we are forced to grid-search...
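To make the idea concrete, here is a minimal sketch of a length-based dispatch with an externally visible threshold. It is not the library's actual code: `find_byte`, `find_byte_serial`, `find_byte_wide`, and `find_byte_threshold` are hypothetical names, the default of 64 is a placeholder rather than a measured value, and `memchr` only stands in for a real SIMD kernel.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tunable: inputs shorter than this take the serial path.
 * Other translation units could adjust it after declaring
 * `extern size_t find_byte_threshold;`.
 * The default of 64 is a placeholder, not a benchmarked value. */
size_t find_byte_threshold = 64;

/* Plain byte-by-byte scan: low fixed overhead, suited to short inputs. */
static const char *find_byte_serial(const char *haystack, size_t length, char needle) {
    for (size_t i = 0; i < length; ++i)
        if (haystack[i] == needle) return haystack + i;
    return NULL;
}

/* Stand-in for a SIMD kernel (assumption): libc memchr is typically
 * vectorized, so it plays the role of the wide implementation here. */
static const char *find_byte_wide(const char *haystack, size_t length, char needle) {
    return (const char *)memchr(haystack, (unsigned char)needle, length);
}

/* Dispatcher: picks the implementation from the input length alone. */
const char *find_byte(const char *haystack, size_t length, char needle) {
    if (length < find_byte_threshold)
        return find_byte_serial(haystack, length, needle);
    return find_byte_wide(haystack, length, needle);
}
```

A benchmark harness could then sweep `find_byte_threshold` on the target machine and keep the best-performing value. With one variable per kernel that is a simple 1-D sweep; with several interacting variables it turns into the grid search mentioned above.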
Are you familiar with similar projects, @nordlow?
Unfortunately not.
Thanks for the response anyway.
Describe what you are looking for
For a given function `f`, are there any (typically architecture-specific) heuristics for when a SIMD version of an algorithm, as opposed to a serial version, should be chosen? For instance, for
one such heuristic could be
.
Can you contribute to the implementation?
Is your feature request specific to a certain interface?
It applies to everything