Closed CedricDViou closed 2 years ago
Hi Cedric, thanks for the report. We have sometimes seen failures such as these that are dependent on GPU card and architecture settings. Can you provide details about your GPU hardware (output of nvidia-smi, for example) and the output of ./configure (or the config.log file)?
Thanks for the quick feedback.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro T2000        On   | 00000000:01:00.0 Off |                  N/A |
| N/A   40C    P0    17W /  N/A |     10MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
configure_stdout.txt config.log
I hope this helps.
It's just these 3 failures? They're all using complex-to-real transforms (which I failed to notice when looking at it on my phone this morning), and there is a known issue with C2R in cuFFT on certain cards and/or certain CUDA versions, so I guess we're hitting it here. I'll gather together some possibly-related info we've stumbled across… I hope we can find a workaround for this one, but Jayce will know more.
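For anyone following along, a complex-to-real (C2R) transform is the inverse of a real-to-complex FFT. The sketch below is a CPU reference in NumPy only, not Bifrost or cuFFT code; the array sizes and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1D: a real signal of length n has n//2 + 1 independent complex bins.
x = rng.standard_normal(16)
X = np.fft.rfft(x)                    # real-to-complex (R2C)
x_back = np.fft.irfft(X, n=x.size)    # complex-to-real (C2R)
np.testing.assert_allclose(x_back, x, rtol=1e-10, atol=1e-12)

# 2D: only the last axis is halved in the complex spectrum.
img = rng.standard_normal((8, 8))
spec = np.fft.rfftn(img)
img_back = np.fft.irfftn(spec, s=img.shape)
np.testing.assert_allclose(img_back, img, rtol=1e-10, atol=1e-12)

print(X.shape, spec.shape)
```

NumPy's inverse transforms divide by the transform size, whereas cuFFT transforms are unnormalized, so a GPU result would need the same scaling before it could be compared with a reference like this one.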
Yes, just these 3 failures. I guess my install is mostly fine then and that I can play with the tutorials. Thanks for your feedback.
Hello, I'm starting to use bifrost and I'm happily starting with the tutorials. However, just out of curiosity, to check my install, I ran make test. Many tests passed, but the run ended with FAILED (failures=3, skipped=4):
FAIL: test_c2r_1D (test_fft.TestFFT)
Traceback (most recent call last):
  File "/home/cedric/tmp/bifrost/test/test_fft.py", line 206, in test_c2r_1D
    self.run_test_c2r(self.shape1D, [0])
  File "/home/cedric/tmp/bifrost/test/test_fft.py", line 148, in run_test_c2r
    self.run_test_c2r_impl(shape, axes)
  File "/home/cedric/tmp/bifrost/test/test_fft.py", line 141, in run_test_c2r_impl
    compare(odata.copy('system'), known_result)
  File "/home/cedric/tmp/bifrost/test/test_fft.py", line 51, in compare
    np.testing.assert_allclose(result, gold, rtol=RTOL, atol=MTOL * absmean)
  File "/home/cedric/anaconda3/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 1528, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/home/cedric/anaconda3/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 842, in assert_array_compare
    raise AssertionError(msg)
AssertionError: Not equal to tolerance rtol=0.1, atol=0.00462357
Mismatched elements: 1 / 16777216 (5.96e-06%)
Max absolute difference: 0.01639435
Max relative difference: 1.93911746
FAIL: test_c2r_2D (test_fft.TestFFT)
AssertionError: Not equal to tolerance rtol=0.1, atol=0.00231149
Mismatched elements: 4186048 / 4194304 (99.8%)
Max absolute difference: 492620.22392237
Max relative difference: 39550830.05759069
FAIL: test_c2r_3D (test_fft.TestFFT)
AssertionError: Not equal to tolerance rtol=0.1, atol=0.00163087
Mismatched elements: 2080441 / 2097152 (99.2%)
Max absolute difference: 88917.37869481
Max relative difference: 6823521.1182115
This was tested on Ubuntu 20.04.4 LTS with Python 3.8.8.
Tell me if I can help. Regards, Cedric
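To help read the failure messages above: the tracebacks show the test calling np.testing.assert_allclose(result, gold, rtol=RTOL, atol=MTOL * absmean), and NumPy flags an element when |result - gold| > atol + rtol * |gold|. The sketch below reproduces that criterion on toy data. The value of MTOL, the definition of absmean as the mean absolute value of gold, and the helper name mismatch_mask are all assumptions made for illustration; only rtol=0.1 is taken from the log.

```python
import numpy as np

RTOL = 0.1    # matches "rtol=0.1" in the failure messages
MTOL = 1e-6   # hypothetical value; the real constant is not shown in this thread


def mismatch_mask(result, gold, rtol=RTOL, mtol=MTOL):
    """Boolean mask of elements that assert_allclose would report as mismatched."""
    absmean = np.abs(gold).mean()   # assumed definition of absmean
    atol = mtol * absmean
    return np.abs(result - gold) > atol + rtol * np.abs(gold)


# Toy reference: mostly large values, with one element close to zero.
gold = np.full(1000, 100.0)
gold[0] = 1e-3

# Apply the same small absolute error to a near-zero and a large element.
result = gold.copy()
result[0] += 0.01
result[1] += 0.01

mask = mismatch_mask(result, gold)
print(mask.sum(), "of", mask.size, "elements out of tolerance")
```

Under these assumptions only the near-zero element is flagged, because the relative term gives it almost no slack. A single mismatched element out of millions, as in the 1D case, can arise this way, while the 2D and 3D cases, where about 99% of the elements mismatch by large factors, point to a wrong result rather than a tolerance edge case.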