Closed by fangq 5 years ago
Hi Felipe,
From the symptom (flickering screen), I am quite certain that the kernel was killed by the graphics driver's TDR (Timeout Detection and Recovery). I assume that mcxcl had no problem running on the CPU targets (-G 2 using the AMD driver, and -G 4 using the Intel driver).
Changing the TdrDelay setting is the typical suggestion for fixing this on Windows. You mentioned that you tried it, but I have a feeling the change was not effective. For NVIDIA GPUs on Windows, you can use an Nsight options dialog to disable TDR (or change the delay time); see
https://docs.nvidia.com/gameworks/content/developertools/desktop/timeout_detection_recovery.htm
but I am not sure if this method works for AMD or Intel GPUs. Maybe you want to give it a try?
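One way to check whether a TdrDelay change actually took effect is to read the value back from the registry. Below is a minimal Python sketch, not part of MCX-CL. It assumes the standard Windows location of the TDR settings (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, value TdrDelay, a DWORD in seconds). The function names tdr_reg_text and read_tdr_delay are hypothetical, and the 60-second delay in the usage note is only an example value.

```python
# Sketch only: helper names are illustrative, not part of MCX-CL or any driver tool.
# Assumed registry location of the Windows TDR settings (under HKEY_LOCAL_MACHINE).
REG_PATH = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_text(delay_seconds):
    """Return the text of a .reg file that sets TdrDelay (a DWORD, in seconds)."""
    if not isinstance(delay_seconds, int) or not 0 < delay_seconds <= 0xFFFFFFFF:
        raise ValueError("delay_seconds must be a positive 32-bit integer")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[HKEY_LOCAL_MACHINE\\{REG_PATH}]\n"
        f'"TdrDelay"=dword:{delay_seconds:08x}\n'  # .reg DWORDs are 8 hex digits
    )


def read_tdr_delay():
    """Return the current TdrDelay in seconds, or None if the value is not set.

    Windows only: winreg is imported here so the module still loads elsewhere.
    """
    import winreg

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, REG_PATH) as key:
        try:
            value, _value_type = winreg.QueryValueEx(key, "TdrDelay")
        except FileNotFoundError:
            return None  # value absent; Windows then uses its default (2 seconds)
    return value
```

To use it, save the output of tdr_reg_text(60) to a .reg file, import it as administrator, and reboot, since the setting is normally only picked up after a restart. After the reboot, read_tdr_delay() should return 60; if it returns None or the old value, the change was not applied.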
Another thing to try is to disable the progress bar (remove -D P or uncheck Show progress bar). The progress bar feature is not very stable and can sometimes cause hanging (the host keeps waiting even though the kernel has completed).
Dear Prof. Fang,
Thank you so much for sharing MCX. Our experience in the last few years with NVIDIA/CUDA based MCX has been excellent. Please find below details of what seems to be an issue with MCX-CL. I'm more than happy to try any suggestions that you may have. Hopefully we are not missing something very obvious (or perhaps, it would be better if we are!).
Best,
Felipe
Problem description:
MCX-CL freezes on a computer with an AMD Radeon R5 430 / Intel(R) HD Graphics 630 when the number of photons is larger than 1.5e6.
Further details:
Computer configuration:
TRIED AND CHECKLIST DONE SO FAR:
** RUN FROM MATLAB COMMAND LINE
%= OUTPUT ==========================================================
MCX-CL-freeze_FreezedRunOnMatlab.png
** RUN ON MCXSTUDIO GUI
MCX-CL-freeze_ValidationOnMCXStudioGUI.png
MCX-CL-freeze_FreezedRunOnMCXStudioGUI.png
** RUN ON CMD CONSOLE