Open simonlegrand opened 1 year ago
What platform are you running on, and do you have another instruction set enabled besides generic256? In particular, alignment requirements may prevent combining generic{128,256} with other instruction sets. For instance, gcc will produce 'movapd' for ymm registers on x86-64 for generic-simd256's LDA macro, but enabling any x86-64 SIMD ISA will set ALIGNMENTA to 16, causing segfaults if the movapd sees an address aligned on a 16-byte, not a 32-byte, boundary.
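To make the alignment point concrete, here is a minimal sketch of a check for whether a pointer sits on a given byte boundary. The helper name is_aligned is hypothetical and is not part of FFTW; it only illustrates why an address that satisfies a 16-byte requirement can still fail a 32-byte one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an FFTW function.
   Returns 1 if p lies on an `alignment`-byte boundary, 0 otherwise.
   `alignment` must be a nonzero power of two (e.g. 16 or 32). */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* A power-of-two alignment divides the address exactly when
       the low bits selected by (alignment - 1) are all zero. */
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

An address 16 bytes past a 32-byte boundary passes is_aligned(p, 16) but fails is_aligned(p, 32), which is the situation described above where an aligned 256-bit load would fault.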
Hello Romain,
I'm sorry for the delayed answer. I performed those tests on my laptop, equipped with an Intel i7-8650U. I just figured out that the problem appears when combining both --enable-avx2 and --enable-generic-simd256.
Regards,
Simon
'generic-simd' should only be used on its own when no specific support for the architecture exists. If you use avx2, generic-simd offers no benefits.
Thank you for those details. Would it be possible to make those options mutually exclusive to avoid that kind of behavior?
Expected result: Identical results whatever the number of threads.
The multithreaded algorithm may be different from the serial algorithm in ways that affect floating-point errors. Even with the serial algorithm, if you use the planner then the roundoff errors may differ (see the FAQ).
To assess whether the differences you are talking about are roundoff errors, it would be helpful to compute the L2 relative error, i.e. ‖data2 - data1‖₂ / ‖data1‖₂
Hello,
This issue might be related to #294.
3.3.10
I experienced what seems to be a bug when compiling with the --enable-generic-simd256 option. I obtain different results when running sequentially and when running with multiple threads (OpenMP).
Here is a piece of code that reproduces the problem. I put identical data into data1 and data2, then perform a sequential transform on the first and a 4-thread transform on the second. Then I get the indices where the results differ and print the corresponding values:
Here is a sample of the output with the --enable-generic-simd256 option:
And the output without the --enable-generic-simd256 option:
I tried smaller domain sizes (128³, 64³, ...) and the problem is not visible.
Best,
Simon