halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

d3d12compute test failures #3909

Closed steven-johnson closed 4 years ago

steven-johnson commented 5 years ago

As of current master (0cea04b5cdcfdfdc129d77f27e551aaafb244326), app/bilateral_grid is failing when using HL_TARGET=host-d3d12compute. For a sample failure, see https://buildbot.halide-lang.org/master/#/builders/62/builds/241/steps/26/logs/stdio (this is from a test branch, but I verified separately that the recent master branch fails on that machine as well).

No idea yet if this is a bug injection in Halide, something specific to that machine (winbot-1), or some driver hiccup on the machine, but it needs to be investigated.

slomp commented 5 years ago

I'll see if I can have a look at it next week. If it's a blocker, we can perhaps disable the test for now and add it to: https://github.com/halide/Halide/issues/3586

steven-johnson commented 5 years ago

Not a blocker, just noise in testing that has to be ignored

steven-johnson commented 5 years ago

Disabling it on the buildbots until we get a chance to debug it

slomp commented 5 years ago

I'm getting this when I try to compile the bilateral_grid.run project in MSVC:

1>c:\program files (x86)\microsoft visual studio\2017\community\vc\tools\msvc\14.16.27023\include\random(2401): error C2338: invalid template argument for uniform_int_distribution: N4659 29.6.1.1 [rand.req.genl]/1e requires one of short, int, long, long long, unsigned short, unsigned int, unsigned long, or unsigned long long
1>halide-master_llvm-80\source\halide\tools\rungen.h(485): note: see reference to class template instantiation 'std::uniform_int_distribution<T2>' being compiled
1>        with
1>        [
1>            T2=int8_t
1>        ]
1>halide-master_llvm-80\source\halide\tools\rungen.h(479): note: see reference to function template instantiation 'void Halide::RunGen::FillWithRandom<int8_t>::fill<int8_t,0x0>(Halide::Runtime::Buffer<int8_t,4> &,std::mt19937 &)' being compiled
1>halide-master_llvm-80\source\halide\tools\rungen.h(479): note: see reference to function template instantiation 'void Halide::RunGen::FillWithRandom<int8_t>::fill<int8_t,0x0>(Halide::Runtime::Buffer<int8_t,4> &,std::mt19937 &)' being compiled
1>halide-master_llvm-80\source\halide\tools\rungen.h(476): note: while compiling class template member function 'void Halide::RunGen::FillWithRandom<int8_t>::operator ()(Halide::Runtime::Buffer<void,4> &,int)'
1>halide-master_llvm-80\source\halide\tools\rungen.h(188): note: see reference to function template instantiation 'void Halide::RunGen::FillWithRandom<int8_t>::operator ()(Halide::Runtime::Buffer<void,4> &,int)' being compiled
1>halide-master_llvm-80\source\halide\tools\rungen.h(188): note: see reference to class template instantiation 'Halide::RunGen::FillWithRandom<int8_t>' being compiled
1>halide-master_llvm-80\source\halide\tools\rungen.h(678): note: see reference to function template instantiation 'void Halide::RunGen::dynamic_type_dispatch<Halide::RunGen::FillWithRandom,Halide::Runtime::Buffer<void,4>&,int&>(const halide_type_t &,Halide::Runtime::Buffer<void,4> &,int &)' being compiled
1>c:\program files (x86)\microsoft visual studio\2017\community\vc\tools\msvc\14.16.27023\include\random(2401): error C2338: note: char, signed char, unsigned char, int8_t, and uint8_t are not allowed

Apparently, uniform_int_distribution with 8-bit integers is not supported by the MSVC stdlib...

steven-johnson commented 5 years ago

The uniform_int_distribution was just fixed. Would be nice to be able to circle back to this to get testing for d3d12 enabled again.

shoaibkamil commented 5 years ago

@slomp is at SIGGRAPH, but I'll ping him next week to see if we can figure it out if any problems still persist.

slomp commented 5 years ago

I'm still getting the following:

2>C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.22.27905\include\random(1925,1): error C2338:  invalid template argument for uniform_int_distribution: N4659 29.6.1.1 [rand.req.genl]/1e requires one of short, int, long, long long, unsigned short, unsigned int, unsigned long, or unsigned long long
2>halide-master_llvm-80\source\halide\tools\RunGen.h(489): message :  see reference to class template instantiation 'std::uniform_int_distribution<T2>' being compiled
2>halide-master_llvm-80\source\halide\tools\RunGen.h(489): message :         with
2>halide-master_llvm-80\source\halide\tools\RunGen.h(489): message :         [
2>halide-master_llvm-80\source\halide\tools\RunGen.h(489): message :             T2=int8_t
2>halide-master_llvm-80\source\halide\tools\RunGen.h(489): message :         ]
2>halide-master_llvm-80\source\halide\tools\RunGen.h(479): message :  see reference to function template instantiation 'void Halide::RunGen::FillWithRandom<int8_t>::fill<int8_t,0x0>(Halide::Runtime::Buffer<int8_t,4> &,std::mt19937 &)' being compiled
2>halide-master_llvm-80\source\halide\tools\RunGen.h(479): message :  see reference to function template instantiation 'void Halide::RunGen::FillWithRandom<int8_t>::fill<int8_t,0x0>(Halide::Runtime::Buffer<int8_t,4> &,std::mt19937 &)' being compiled
2>halide-master_llvm-80\source\halide\tools\RunGen.h(476): message :  while compiling class template member function 'void Halide::RunGen::FillWithRandom<int8_t>::operator ()(Halide::Runtime::Buffer<void,4> &,int)'
2>halide-master_llvm-80\source\halide\tools\RunGen.h(188): message :  see reference to function template instantiation 'void Halide::RunGen::FillWithRandom<int8_t>::operator ()(Halide::Runtime::Buffer<void,4> &,int)' being compiled
2>halide-master_llvm-80\source\halide\tools\RunGen.h(188): message :  see reference to class template instantiation 'Halide::RunGen::FillWithRandom<int8_t>' being compiled
2>halide-master_llvm-80\source\halide\tools\RunGen.h(702): message :  see reference to function template instantiation 'void Halide::RunGen::dynamic_type_dispatch<Halide::RunGen::FillWithRandom,Halide::Runtime::Buffer<void,4>&,int&>(const halide_type_t &,Halide::Runtime::Buffer<void,4> &,int &)' being compiled
2>C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.22.27905\include\random(1925,1): error C2338:  note: char, signed char, unsigned char, char8_t, int8_t, and uint8_t are not allowed

steven-johnson commented 5 years ago

Sigh -- I'm gonna guess that MSVC defines int8_t as signed char, so we probably need a case for that too.
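
One way to sidestep the restriction quoted in the error message is to never instantiate std::uniform_int_distribution with a one-byte type at all, and instead draw from int or unsigned int and narrow the result. The following is a minimal sketch under that assumption, not the change that actually went into RunGen.h; the names DistType and random_int are hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <random>
#include <type_traits>

// Hypothetical helper: pick a distribution type the standard allows.
// Any one-byte integer type (char, signed char, unsigned char, int8_t,
// uint8_t, ...) is widened to int or unsigned int; other types pass through.
template <typename T>
using DistType = typename std::conditional<
    sizeof(T) == 1,
    typename std::conditional<std::is_signed<T>::value, int, unsigned int>::type,
    T>::type;

// Hypothetical helper: return a uniformly distributed value covering the
// full range of T, sampling in the wider type and narrowing afterwards.
template <typename T>
T random_int(std::mt19937 &rng) {
    static_assert(std::is_integral<T>::value, "integral types only");
    std::uniform_int_distribution<DistType<T>> dist(
        std::numeric_limits<T>::min(), std::numeric_limits<T>::max());
    // The sampled value is always within T's range, so the cast is lossless.
    return static_cast<T>(dist(rng));
}
```

Keying the widening on sizeof(T) == 1 rather than on a list of named types means it does not matter whether a given standard library defines int8_t as signed char or as something else.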

steven-johnson commented 5 years ago

Since we think we've addressed the underlying issue, I'm going to re-enable D3D12 testing on the buildbots.

steven-johnson commented 5 years ago

Reassigning to myself so I can close once the buildbots get thru that phase

steven-johnson commented 5 years ago

After re-enabling the backend, we found:

Re-assigning to @shoaibkamil for triage

steven-johnson commented 4 years ago

Any idea where we stand on this? D3D12Compute testing has been disabled on the buildbots for months now.

steven-johnson commented 4 years ago

(And I don't have a local Windows machine on which to test. If someone who does could run test_correctness, test_generator, and test_apps with host-d3d12compute, I would appreciate it.)

slomp commented 4 years ago

I'll see if I can run these tests on Friday!

slomp commented 4 years ago

Hmmm, I'm trying to run:

$ ctest -L correctness --output-on-failure

but I keep getting this:

Test project /d/b/build/halide/origin/master/msvc/Win64/Release
No tests were found!!!

Do I have to call ctest from a particular folder? I tried a few here to no avail.

steven-johnson commented 4 years ago

You also have to specify -C Release (or Debug) on Windows.

It's a very unhelpful error message.

slomp commented 4 years ago

Here are the results:

The following tests FAILED:
         91 - correctness_dynamic_allocation_in_gpu_kernel (Failed)
        143 - correctness_gpu_mixed_shared_mem_types (Failed)
        151 - correctness_gpu_reuse_shared_memory (Failed)
        204 - correctness_math (Failed)
        220 - correctness_newtons_method (Failed)

I think that 91, 143 and 151 might be related.

correctness_math is failing on pow() tests:

relatively_equal failed for (0.0001, -nan(ind)) with relative error nan
For pow(-10.00000000000000000000, -4.00000000000000000000) == 0.00009999999747378752 from C and -nan(ind) from x86-64-windows-d3d12compute-jit.
relatively_equal failed for (-0.00237037, -nan(ind)) with relative error nan
For pow(-7.50000000000000000000, -3.00000000000000000000) == -0.00237037031911313534 from C and -nan(ind) from x86-64-windows-d3d12compute-jit.
relatively_equal failed for (0.04, -nan(ind)) with relative error nan
For pow(-5.00000000000000000000, -2.00000000000000000000) == 0.03999999910593032837 from C and -nan(ind) from x86-64-windows-d3d12compute-jit.
relatively_equal failed for (-0.4, -nan(ind)) with relative error nan
For pow(-2.50000000000000000000, -1.00000000000000000000) == -0.40000000596046447754 from C and -nan(ind) from x86-64-windows-d3d12compute-jit.
relatively_equal failed for (1, -nan(ind)) with relative error nan
For pow(0.00000000000000000000, 0.00000000000000000000) == 1.00000000000000000000 from C and -nan(ind) from x86-64-windows-d3d12compute-jit.

correctness_newtons_method generates a shader that makes the dxc compiler warn about a potential division by zero. Other than that, the result seems to be correct, but with much lower precision:

Incorrect results: 3.14159274101257324219 3.14159250259399414063 3.14159274101257324219

slomp commented 4 years ago

91, 143, and 151: I think these are the same tests that were failing originally when we disabled the tests.

It has been acknowledged that correctness_math may fail: https://github.com/halide/Halide/issues/3909#issuecomment-529689640 but NaN is strange!

Not sure what the correct course of action is for correctness_newtons_method, though.
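
To put a number on the mismatch reported for the Newton's method test, the two distinct values in the "Incorrect results" line can be compared in units in the last place (ULPs). The sketch below assumes 32-bit IEEE-754 floats and finite inputs of the same sign; ulp_distance and within_ulps are hypothetical helpers, not part of the Halide test suite.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: number of representable floats between a and b.
// Assumes IEEE-754 binary32, finite inputs, and matching signs, in which
// case the bit patterns are ordered the same way as the values.
inline std::int64_t ulp_distance(float a, float b) {
    std::int32_t ia, ib;
    std::memcpy(&ia, &a, sizeof(float));  // well-defined type punning
    std::memcpy(&ib, &b, sizeof(float));
    return std::llabs(static_cast<long long>(ia) - static_cast<long long>(ib));
}

// Hypothetical helper: tolerance check expressed in ULPs.
inline bool within_ulps(float a, float b, std::int64_t max_ulps) {
    return ulp_distance(a, b) <= max_ulps;
}
```

Under those assumptions, ulp_distance(3.14159274101257324219f, 3.14159250259399414063f) evaluates to 1, so the D3D12 result is the float immediately below the expected one. Whether a test should tolerate that is a policy question this sketch does not answer.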

slomp commented 4 years ago

About the NaN in pow(): https://sakibsaikia.github.io/graphics/2017/01/19/Hunting-Down-NaNs.html

"The thing to note here is that pow(x,y) has been changed to exp(y * log(x)). This is because exp() and log() are quarter rate instructions. Which means that x needs to be a positive non-zero value, otherwise the result is undefined. Turns out SM3 implementation of log() already performs this high level logic for us which suppresses any NaNs."

"In short, for SM5 the intrinsic functions do just what you ask them to do - nothing more. It is up to you to feed correct values to these functions or make sure you saturate() or clamp() your values to the correct range."

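The exp(y * log(x)) lowering described in the quotes above can be reproduced on the CPU, and it yields NaN for the same inputs that fail in correctness_math (negative bases, and 0 raised to 0). The sketch below contrasts that naive lowering with a guarded variant; both function names are hypothetical, and this is only an illustration of the failure mode, not the fix that was applied in Halide.

```cpp
#include <cmath>
#include <limits>

// Naive lowering: only meaningful for x > 0.
// log() of a negative number is NaN, and 0 * log(0) is 0 * -inf, also NaN.
inline float naive_pow(float x, float y) {
    return std::exp(y * std::log(x));
}

// Guarded variant (hypothetical): handles the cases the naive form misses.
inline float guarded_pow(float x, float y) {
    if (y == 0.0f) return 1.0f;                        // pow(x, 0) == 1, including x == 0
    if (x > 0.0f) return std::exp(y * std::log(x));    // safe domain for log()
    if (x == 0.0f) {
        return y > 0.0f ? 0.0f : std::numeric_limits<float>::infinity();
    }
    // x < 0: a real result exists only for integer exponents.
    if (y != std::trunc(y)) return std::numeric_limits<float>::quiet_NaN();
    float magnitude = std::exp(y * std::log(-x));
    bool odd = std::fmod(y, 2.0f) != 0.0f;             // odd exponents keep the sign
    return odd ? -magnitude : magnitude;
}
```

For example, naive_pow(-10.0f, -4.0f) is NaN, while guarded_pow(-10.0f, -4.0f) is approximately 0.0001, in line with the value reported "from C" in the log above.
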
shoaibkamil commented 4 years ago

Fixed by #5003