Open ntkylin opened 3 months ago
Additionally, when the crash happened, the GPU seemed to be fully loaded, with its cooling fan running at full speed. After a reboot, the system could not start correctly and only showed the BIOS interface. After I shut it down manually and restarted it again, it could boot into Windows, but the driver for GPU2 (the 7900GRE) was gone.
Thanks for the report and thanks for the contributions! Unfortunately, it sounds like your GPU may have malfunctioned. The path mentioned in the error is a path to the source code on my machine, included to help with debugging, which is expected since that's where it was compiled. The error indicates that your GPU (possibly due to overheating, or just a random unpredictable failure) might have started to return incorrect numbers during its computation.
I'm really sorry for your trouble - if you get your system working again, I would recommend against contributing further using that machine, both to avoid stressing the GPU and because, if the GPU starts to return incorrect values after running for long enough, it might start to produce low-quality data that isn't useful for training anyways.
Hi, the KataGo contribution routines crashed on my Windows 11 22H2 system after about 2700 training/rating games, with the following error:
Unfortunately, I cannot find that cpp file at the path above. Do you know what's wrong?
The GPU is an AMD 7900GRE, with driver version Adrenalin 23.7.2 (WHQL Recommended). The log file is shown below: