Open ax3l opened 5 years ago
Testing on a GTX 950M, I get this while running PIConGPU:
</home/berceanu/src/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.3.0/picongpu-0.4.0-lqbxwsudtgms2do4ksm57uovvv4ypx4e/thirdParty/cuda_memtest/misc.cpp>:35
It seems to be just a warning, as the simulation completes after that.
Disabling the memtest makes the message go away:
pic-build -b "cuda:50" -c "-DCUDAMEMTEST_ENABLE=OFF"
Should we add a known issue in the docs for non-Tesla cards?
Thx for the report! Can you please post the warning? Is there a line missing?
Nope, there is only that single line.
Ah ok, but it does not abort, yes!
Ok, we have to clean up that macro; it should not randomly start to write to cerr:
https://github.com/ComputationalRadiationPhysics/cuda_memtest/blob/7a585d504831431d0e95ff00d0217181201dbb12/cuda_memtest.h#L146-L150
I proposed a fix in #18 that should remove that noisy line from your output. It can (rightfully) be ignored.
Nvidia NVML does not support non-Tesla products very well. Problems are known with mobile cards and even Quadro cards. (Reported as RFE to Nvidia as Bug ID 2417658.)
Anyway, this can lead to cuda_memtest throwing an "[NVML] Error: Not supported" exception (in nvmlDeviceGetSerial), which we should catch.
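To make the "catch it" idea concrete, here is a rough sketch of what I have in mind. It is not the actual cuda_memtest code: NvmlError, QueryStatus, querySerialOrThrow and trySerial are hypothetical names, and the query is a stub so the snippet compiles without NVML. In the real code the stub would wrap nvmlDeviceGetSerial and check for NVML_ERROR_NOT_SUPPORTED.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for NVML status codes (the real ones live in nvml.h).
enum class QueryStatus { Success, NotSupported, OtherError };

// Hypothetical exception type, standing in for whatever cuda_memtest throws.
struct NvmlError : std::runtime_error {
    QueryStatus status;
    NvmlError(QueryStatus s, const std::string& msg)
        : std::runtime_error(msg), status(s) {}
};

// Stub for the serial-number query. `supported` simulates a Tesla card (true)
// versus a mobile or consumer card (false); the returned serial is made up.
std::string querySerialOrThrow(bool supported) {
    if (!supported) {
        throw NvmlError(QueryStatus::NotSupported, "[NVML] Error: Not supported");
    }
    return "0123456789";
}

// Returns true and fills serialOut when the serial is available.
// A "not supported" error is swallowed and reported as false;
// any other error is still propagated to the caller.
bool trySerial(bool supported, std::string& serialOut) {
    try {
        serialOut = querySerialOrThrow(supported);
        return true;
    } catch (const NvmlError& e) {
        if (e.status == QueryStatus::NotSupported) {
            serialOut.clear();
            return false;
        }
        throw;
    }
}
```

With something like this, the caller can skip the serial number on non-Tesla cards and carry on with the memory test, while real NVML failures still surface.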