arduano / simdeez

easy simd
MIT License

Fix example formatting and remove inline attribute #1

Closed: jonas-schievink closed this 6 years ago

jonas-schievink commented 6 years ago

target_feature attributes work independently of inline attributes, and #[inline(always)] just forces LLVM to inline calls even in debug mode.

Also, it would be great if the example explained why all the functions are marked as unsafe.

jackmott commented 6 years ago

I have confirmed the inline needs to be there. If the generic form of the function does not get inlined into the function carrying the target_feature attribute, the generic code is compiled without that target_feature. So, for instance, the AVX2 intrinsics get downgraded by the compiler to SSE2 equivalents.
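For readers following along, here is a minimal sketch of the pattern under discussion. It is not simdeez's actual code: the names distance_generic, distance_avx2 and distance are made up for illustration, and a plain scalar loop stands in for the library's generic SIMD code. The shared kernel is marked #[inline(always)] so that its body is compiled inside the #[target_feature] wrapper and can use that wrapper's instruction set.

```rust
// Shared kernel with no target_feature of its own. #[inline(always)]
// asks the compiler to place its body inside each caller, so the code
// is generated with whatever features the caller enables.
#[inline(always)]
fn distance_generic(xs: &[f32], ys: &[f32], out: &mut [f32]) {
    for ((o, x), y) in out.iter_mut().zip(xs).zip(ys) {
        *o = (x * x + y * y).sqrt();
    }
}

// AVX2 entry point (hypothetical name). It is unsafe because running it
// on a CPU without AVX2 is undefined behavior.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn distance_avx2(xs: &[f32], ys: &[f32], out: &mut [f32]) {
    distance_generic(xs, ys, out)
}

// Safe front end: check the CPU at runtime, then dispatch.
pub fn distance(xs: &[f32], ys: &[f32]) -> Vec<f32> {
    let mut out = vec![0.0; xs.len().min(ys.len())];
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just verified at runtime.
            unsafe { distance_avx2(xs, ys, &mut out) };
            return out;
        }
    }
    // Fallback for CPUs (or architectures) without AVX2.
    distance_generic(xs, ys, &mut out);
    out
}
```

Calling distance(&[3.0, 0.0], &[4.0, 2.0]) gives the same values on either path; only the generated instructions differ. The sketch also shows one reason such functions end up unsafe: the caller has to guarantee the CPU feature is present.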

If you have doubts/questions let me know.

jonas-schievink commented 6 years ago

Interesting. How exactly did you confirm? With or without optimizations? LLVM should definitely inline the call as it's the only one for each specific instantiation of distance.

jackmott commented 6 years ago

One way is just to yank it off, run a benchmark with full optimizations on, and see it get ~5x slower. You could also build up a test case in https://godbolt.org/ and look at the generated assembly. Now that I think about it, the test I was doing was with a 'pub' function; maybe if it were private, the compiler would tend to inline it.
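A rough sketch of how such a comparison could be set up, assuming a simple scalar kernel rather than the library's real code. All names here (scale_inlined, scale_outlined, timed_scale and the two wrappers) are hypothetical. The two kernels are identical apart from the inline attribute, so any difference in timing or in the assembly comes from whether the body was compiled inside the AVX2 wrapper. Whether a gap shows up depends on the compiler version and optimization level, so no result is claimed here.

```rust
use std::time::{Duration, Instant};

// Same loop twice; only the inline attribute differs.
#[inline(always)]
fn scale_inlined(data: &mut [f32], k: f32) {
    for v in data.iter_mut() {
        *v *= k;
    }
}

#[inline(never)]
fn scale_outlined(data: &mut [f32], k: f32) {
    for v in data.iter_mut() {
        *v *= k;
    }
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn scale_avx2_inlined(data: &mut [f32], k: f32) {
    scale_inlined(data, k) // body is compiled with AVX2 enabled
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn scale_avx2_outlined(data: &mut [f32], k: f32) {
    scale_outlined(data, k) // stays a call into code built without AVX2
}

// Runs one variant and returns how long it took.
pub fn timed_scale(inlined: bool, data: &mut [f32], k: f32) -> Duration {
    let start = Instant::now();
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just verified at runtime.
            unsafe {
                if inlined {
                    scale_avx2_inlined(data, k)
                } else {
                    scale_avx2_outlined(data, k)
                }
            }
            return start.elapsed();
        }
    }
    // No AVX2 available: run the plain kernels instead.
    if inlined {
        scale_inlined(data, k)
    } else {
        scale_outlined(data, k)
    }
    start.elapsed()
}
```

For a meaningful measurement, build in release mode, use a large buffer, and repeat each variant many times; a single call on a small slice mostly measures timer overhead.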

jackmott commented 6 years ago

For now I'm going to merge this and put the inline back, but we can investigate further.