NVIDIAGameWorks / RayTracingDenoiser

NVIDIA Ray Tracing Denoiser

SSE instruction throws illegal instruction. #21

Closed Hearwindsaying closed 3 years ago

Hearwindsaying commented 3 years ago

Are there any CPU requirements? My machine has two E5-2665 CPUs, which should support SSE instructions. For every sample scene, the program crashes at these SSE instructions: [image: Capture]

Hearwindsaying commented 3 years ago

It turns out that my CPU does not support AVX2, so an illegal instruction exception is thrown. Is there a fallback path for SSE? Thanks!
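For anyone who wants to confirm this kind of diagnosis before rebuilding, AVX2 support can be queried at run time. The following is only a minimal sketch: `CpuSupportsAvx2` is a hypothetical helper name and is not part of NRD. It assumes an x86 target and uses either the MSVC `__cpuid`/`__cpuidex`/`_xgetbv` intrinsics or the GCC/Clang `__builtin_cpu_supports` builtin.

```cpp
#if defined(_MSC_VER)
    #include <intrin.h>     // __cpuid, __cpuidex
    #include <immintrin.h>  // _xgetbv
#endif

// Hypothetical helper (not part of NRD): returns true if the CPU reports AVX2
// and, on the MSVC path, the OS has enabled saving of the YMM registers.
inline bool CpuSupportsAvx2()
{
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {};

    // Leaf 0: EAX holds the highest supported standard leaf; AVX2 lives in leaf 7.
    __cpuid(regs, 0);
    if (regs[0] < 7)
        return false;

    // Leaf 1: ECX bit 27 = OSXSAVE, ECX bit 28 = AVX.
    __cpuid(regs, 1);
    const bool osxsave = (regs[2] & (1 << 27)) != 0;
    const bool avx = (regs[2] & (1 << 28)) != 0;
    if (!osxsave || !avx)
        return false;

    // XCR0 bits 1 and 2 must both be set: the OS saves XMM and YMM state.
    if ((_xgetbv(0) & 0x6) != 0x6)
        return false;

    // Leaf 7, subleaf 0: EBX bit 5 = AVX2.
    __cpuidex(regs, 7, 0);
    return (regs[1] & (1 << 5)) != 0;
#elif (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    // Not an x86 target: AVX2 does not apply.
    return false;
#endif
}
```

If this returns false, any binary built for AVX2 can be expected to fail with an illegal instruction on that machine, independent of which sample scene is loaded.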

dzhdanNV commented 3 years ago

#define PLATFORM_INTRINSIC_SSE3                     0 // NOTE: +SSSE3
#define PLATFORM_INTRINSIC_SSE4                     1
#define PLATFORM_INTRINSIC_AVX1                     2 // NOTE: +FP16C
#define PLATFORM_INTRINSIC_AVX2                     3 // NOTE: +FMA3

#if (defined(_MSC_VER) && (_MSC_VER >= 1920)) || defined(__clang__) || defined(__GNUC__)
    // TODO: disable __m256d emulation if VS2019 is used
    #define PLATFORM_INTRINSIC                          PLATFORM_INTRINSIC_AVX2
#else
    #define PLATFORM_INTRINSIC                          PLATFORM_INTRINSIC_SSE4
#endif

You can change PLATFORM_INTRINSIC to a value that matches your CPU.

It's in "platform.h".
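As an illustration of picking a level that matches the build target, the sketch below derives PLATFORM_INTRINSIC from the compiler's predefined macros instead of hard-coding it. This is an assumption about how one might restructure the selection, not what "platform.h" actually does. The `#ifndef` override and the `PlatformIntrinsicName` helper are hypothetical, and the extra requirements in the NOTE comments (FP16C, FMA3) are not checked here.

```cpp
#define PLATFORM_INTRINSIC_SSE3 0 // NOTE: +SSSE3
#define PLATFORM_INTRINSIC_SSE4 1
#define PLATFORM_INTRINSIC_AVX1 2 // NOTE: +FP16C
#define PLATFORM_INTRINSIC_AVX2 3 // NOTE: +FMA3

// Hypothetical override: a value passed on the command line
// (e.g. -DPLATFORM_INTRINSIC=1) wins over the automatic choice.
#ifndef PLATFORM_INTRINSIC
    #if defined(__AVX2__)
        // Defined by MSVC for /arch:AVX2 and by GCC/Clang for -mavx2.
        #define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_AVX2
    #elif defined(__AVX__)
        // Defined by MSVC for /arch:AVX and by GCC/Clang for -mavx.
        #define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_AVX1
    #elif defined(__SSE4_1__)
        // GCC/Clang only; MSVC does not define this macro.
        #define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_SSE4
    #else
        #define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_SSE3
    #endif
#endif

static_assert(PLATFORM_INTRINSIC >= PLATFORM_INTRINSIC_SSE3 &&
              PLATFORM_INTRINSIC <= PLATFORM_INTRINSIC_AVX2,
              "PLATFORM_INTRINSIC must be one of the four levels above");

// Hypothetical helper: human-readable name of the selected level, for logging.
inline const char* PlatformIntrinsicName()
{
    switch (PLATFORM_INTRINSIC)
    {
        case PLATFORM_INTRINSIC_SSE3: return "SSE3";
        case PLATFORM_INTRINSIC_SSE4: return "SSE4";
        case PLATFORM_INTRINSIC_AVX1: return "AVX1";
        default:                      return "AVX2";
    }
}
```

The design choice here is that the intrinsic level follows the compiler flags, so the header and the generated code cannot disagree about which instruction sets are allowed.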

Hearwindsaying commented 3 years ago

Thanks for your advice. Actually, I have tried this before:

#define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_SSE3

I also tried PLATFORM_INTRINSIC_SSE4 and PLATFORM_INTRINSIC_AVX1.

However, I got compiler errors such as "cannot convert v4d to __m256d". (Sorry, I forget the details and do not have access to my project at the moment, but as far as I remember the errors were all conversion failures of this kind.) The original code:

#define PLATFORM_INTRINSIC PLATFORM_INTRINSIC_AVX2

does compile in VS 2019 without warnings.

Any insights? Thanks!

ch45er commented 3 years ago

Apparently I am facing the same issue as @Hearwindsaying.

If you simply set PLATFORM_INTRINSIC to PLATFORM_INTRINSIC_SSE4 (or lower) when building with MSVC, a bunch of type conversion errors like this occur:

'__m256d _mm256_sin_pd(__m256d)': cannot convert argument 1 from 'const emu__m256d' to '__m256d'

It turns out that the fallback implementations of the _mm256_XXX_pd functions don't get pulled in because they're guarded by the PLATFORM_HAS_SVML_INTRISICS compile-time condition (see MathLib_d.h).

Disabling the SVML guard altogether breaks the single-precision routines like so:

'_mm_tan_ps': ambiguous call to overloaded function

A somewhat working workaround is to disable the check just in MathLib_d.h for non-AVX builds.

However, this doesn't solve the initial issue: the example code still crashes with an illegal instruction, with the stack pointing to

>   09_RayTracing_NRD.exe!Zbuffer::`dynamic initializer for 'DepthNear''() Line 131 C++

in my case. Any ideas?

[EDIT]

After some more poking around, I found that the AVX instruction set is also enabled by default in the MSVC project settings. Removing the /arch switch solves the issue. This has to be done for the Sample projects.
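To check whether a given translation unit was built with such a switch, the compiler's predefined macros can be inspected. MSVC defines `__AVX__` under /arch:AVX or higher and `__AVX2__` under /arch:AVX2 or higher. GCC and Clang define the same macros for -mavx and -mavx2. The sketch below is illustrative only: `ArchBaseline`, `CompiledArchBaseline` and `ToString` are hypothetical names, not part of the samples.

```cpp
// Hypothetical names for illustration; nothing here comes from the NRD samples.
enum class ArchBaseline { NoAvx, Avx, Avx2 };

// Reports the AVX level the compiler was allowed to use for ALL code in this
// translation unit, including static initializers that run before main().
constexpr ArchBaseline CompiledArchBaseline()
{
#if defined(__AVX2__)
    return ArchBaseline::Avx2;
#elif defined(__AVX__)
    return ArchBaseline::Avx;
#else
    return ArchBaseline::NoAvx;
#endif
}

inline const char* ToString(ArchBaseline baseline)
{
    switch (baseline)
    {
        case ArchBaseline::Avx2: return "AVX2";
        case ArchBaseline::Avx:  return "AVX";
        default:                 return "no AVX";
    }
}
```

Because the switch applies to ordinary code as well as to intrinsics, a crash inside a dynamic initializer like the one in the stack trace above is consistent with this: the instruction runs before main(), so a run-time CPU check placed in main() would come too late.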

dzhdanNV commented 3 years ago

Thanks for digging into this. I hope it will be fixed in the next submit:

ch45er commented 3 years ago

Great news, thank you 👍

dzhdanNV commented 3 years ago

Updated. Please check and close if it works as expected.

ch45er commented 3 years ago

Works fine for me.

Hearwindsaying commented 3 years ago

Works for me! Thanks to all folks!