bennyguo / instant-nsr-pl

Neural Surface reconstruction based on Instant-NGP. Efficient and customizable boilerplate for your research projects. Train NeuS in 10min!
MIT License

Runtime error in cudaGraphExecUpdate() from tiny-cuda-nn #23

Closed by morsingher 1 year ago

morsingher commented 1 year ago

Hi, thanks for your awesome work. I get a weird error during validation:

    terminate called after throwing an instance of 'std::runtime_error'
      what():  /tmp/pip-req-build-z4954kz1/include/tiny-cuda-nn/cuda_graph.h:124 cudaGraphExecUpdate(m_graph_instance, m_graph, &error_node, &update_result) failed with error the graph update was not performed because it included changes which violated constraints specific to instantiated graph update
    Aborted

Do you know what causes this problem and how to solve it? Thank you in advance!

morsingher commented 1 year ago

A bit more context, in case it helps: this happens with V100 cards, but not with Titan Xp cards. In both cases, CUDA 11.6 is installed.

bennyguo commented 1 year ago

Hi! Which PyTorch version are you using? Try updating PyTorch to the latest version and reinstalling tiny-cuda-nn.

morsingher commented 1 year ago

Hi, I have PyTorch 1.13 with CUDA 11.6 in both cases, as well as the latest tiny-cuda-nn. I'm not sure why the error appears on one server and not on the other. I have also opened an issue on the tiny-cuda-nn repository about this.
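
When the same software versions behave differently on two machines, it can help to record exactly what each one reports. The sketch below is one possible way to do that, not something taken from this repository: `collect_env_report` is a hypothetical helper name, and it uses only standard PyTorch calls. It prints the PyTorch version, the CUDA version PyTorch was built against, and each GPU's name and compute capability (a V100 reports 7.0, a Titan Xp 6.1).

```python
import importlib
import platform


def collect_env_report():
    """Collect version and GPU details to compare two machines.

    Hypothetical helper for illustration; not part of instant-nsr-pl.
    """
    report = {"python": platform.python_version()}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this environment.
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    # CUDA version PyTorch was built with (None for CPU-only builds).
    report["torch_cuda"] = torch.version.cuda
    report["devices"] = []
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(i)
            report["devices"].append(
                {
                    "index": i,
                    "name": torch.cuda.get_device_name(i),
                    "compute_capability": f"{major}.{minor}",
                }
            )
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Running this on both servers and diffing the output shows whether anything besides the GPU model differs. It does not explain the `cudaGraphExecUpdate` failure by itself.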

morsingher commented 1 year ago

After some debugging, the problem seems to be caused by the nerfacc rendering function (again, only with V100 cards). If I replace the RGB values with random numbers, everything works fine. I have no idea what causes this issue, but I will dig deeper into it. Closing for now, as it is not related to your repository.
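
The substitution test described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code used in the thread: `summarize_tensor` and `random_like` are hypothetical helper names, and `rgb` stands for whatever tensor the rendering step returns. The idea is to log basic properties of the rendered tensor (shape, dtype, device, contiguity, NaN/Inf) and then swap in random values of the same shape to see whether the failure follows the data.

```python
def summarize_tensor(name, t):
    """Return basic properties of a torch tensor for debugging.

    Hypothetical helper; `t` is assumed to be a torch.Tensor.
    """
    import torch  # imported lazily so the module loads without PyTorch

    info = {
        "name": name,
        "shape": tuple(t.shape),
        "dtype": str(t.dtype),
        "device": str(t.device),
        "contiguous": t.is_contiguous(),
    }
    if t.is_floating_point():
        # NaN/Inf checks only make sense for floating-point tensors.
        info["has_nan"] = bool(torch.isnan(t).any())
        info["has_inf"] = bool(torch.isinf(t).any())
    return info


def random_like(t):
    """Return uniform random values with the same shape, dtype and device as `t`."""
    import torch

    return torch.rand_like(t)
```

For example, calling `print(summarize_tensor("rgb", rgb))` before the validation step and then passing `random_like(rgb)` downstream instead of `rgb` reproduces the comparison from the comment. If the random tensor works and the rendered one does not, comparing the two summaries may narrow down what differs.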