NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/

FileCheck validation fails for some examples when building with OMP/TBB host systems. #740

Open alliepiper opened 4 years ago

alliepiper commented 4 years ago

Background

The Thrust examples are currently validated by comparing their output to a reference output using the LLVM FileCheck utility. This additional validation is not enabled by default, and as a result the ctest tests that run example code cannot fail by default.

Failures

After enabling FileCheck, examples start failing on configurations with a non-cpp host:

        6844 - thrust.omp.cpp.cpp11.example.bucket_sort2d (Failed)
        6880 - thrust.omp.cpp.cpp11.example.sum (Failed)
        6895 - thrust.omp.cpp.cpp14.example.bucket_sort2d (Failed)
        6931 - thrust.omp.cpp.cpp14.example.sum (Failed)
        6946 - thrust.omp.cpp.cpp17.example.bucket_sort2d (Failed)
        6982 - thrust.omp.cpp.cpp17.example.sum (Failed)
        6997 - thrust.omp.omp.cpp11.example.bucket_sort2d (Failed)
        7033 - thrust.omp.omp.cpp11.example.sum (Failed)
        7048 - thrust.omp.omp.cpp14.example.bucket_sort2d (Failed)
        7084 - thrust.omp.omp.cpp14.example.sum (Failed)
        7099 - thrust.omp.omp.cpp17.example.bucket_sort2d (Failed)
        7150 - thrust.omp.tbb.cpp11.example.bucket_sort2d (Failed)
        7201 - thrust.omp.tbb.cpp14.example.bucket_sort2d (Failed)
        7252 - thrust.omp.tbb.cpp17.example.bucket_sort2d (Failed)
        7288 - thrust.omp.tbb.cpp17.example.sum (Failed)
        7474 - thrust.tbb.cpp.cpp11.example.bucket_sort2d (Failed)
        7525 - thrust.tbb.cpp.cpp14.example.bucket_sort2d (Failed)
        7576 - thrust.tbb.cpp.cpp17.example.bucket_sort2d (Failed)
        7627 - thrust.tbb.omp.cpp11.example.bucket_sort2d (Failed)
        7678 - thrust.tbb.omp.cpp14.example.bucket_sort2d (Failed)
        7729 - thrust.tbb.omp.cpp17.example.bucket_sort2d (Failed)
        7780 - thrust.tbb.tbb.cpp11.example.bucket_sort2d (Failed)
        7831 - thrust.tbb.tbb.cpp14.example.bucket_sort2d (Failed)
        7882 - thrust.tbb.tbb.cpp17.example.bucket_sort2d (Failed)

example.sum

The example.sum does not seem to produce stable results; ten consecutive runs print ten different sums:

$ for i in `seq 10`;do examples/thrust.omp.omp.cpp11.example.sum; done
sum is 509119
sum is 498850
sum is 521001
sum is 510424
sum is 527787
sum is 507166
sum is 500323
sum is 519696
sum is 524764
sum is 528394

example.bucket_sort2d

The example.bucket_sort2d failures seem to be from the output being written in non-deterministic order when the host system is not cpp.
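
If the order of that output is genuinely unspecified, one way to validate it is to compare contents rather than sequence. The following is a minimal sketch under that assumption; `same_elements` is a hypothetical helper, not something that exists in Thrust or in the current example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns true when both vectors hold the same elements,
// ignoring the order in which they appear. Both arguments are taken by value
// so the caller's data is left untouched by the sort.
template <typename T>
bool same_elements(std::vector<T> actual, std::vector<T> expected)
{
  std::sort(actual.begin(), actual.end());
  std::sort(expected.begin(), expected.end());
  return actual == expected;
}
```

A check like this would accept any permutation of the expected values while still rejecting missing or duplicated elements.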

RCCA: Replace FileCheck with more robust runtime checks

We need to fix these failures, but should also reconsider the use of optional dependencies like FileCheck for testing, since these failures have gone undetected for a while. Instead, the example executables should validate themselves at runtime and use their exit code to indicate failure. This will simplify our ctest harness significantly.
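
As a rough illustration of what self-validation could look like, here is a sketch of an example that checks its own result and reports it through a status code. The names `expect_equal` and `run_sum_example` are hypothetical, the input is fixed so the expected value is known in advance, and `std::accumulate` is only a stand-in for the parallel reduction the real example would call.

```cpp
#include <cstdlib>
#include <iostream>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Thrust): reports a mismatch on stderr and
// returns whether the two values compare equal.
template <typename T>
bool expect_equal(const char* what, const T& actual, const T& expected)
{
  if (actual == expected)
  {
    return true;
  }
  std::cerr << "FAILED: " << what << ": got " << actual
            << ", expected " << expected << "\n";
  return false;
}

// Stand-in for the body of an example such as example.sum.
int run_sum_example()
{
  // Fixed input: 1, 2, ..., 1000.
  std::vector<int> data(1000);
  std::iota(data.begin(), data.end(), 1);

  // Assumption: std::accumulate stands in for the reduction under test.
  const long long sum = std::accumulate(data.begin(), data.end(), 0LL);
  std::cout << "sum is " << sum << "\n";

  // Closed form for 1 + 2 + ... + 1000.
  const long long expected = 1000LL * 1001LL / 2;
  return expect_equal("sum", sum, expected) ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

The example's `main` would then end with `return run_sum_example();`, so ctest sees a nonzero exit status on a wrong result without needing FileCheck.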

Open Question: DVS?

I'm not sure how difficult it would be to update DVS to expect runtime failures. Updating CMake for this would be trivial.

jrhemstad commented 1 year ago

@allisonvacanti is this still relevant? It sounds like we'd want to add the equivalent of unit test EXPECTS logic to the examples to verify they produce valid output at runtime?