Prior to this commit, an exception thrown during the capture of a CUDA graph would result in std::terminate being called. This commit updates the implementation of "vm.builtin.cuda_graph.run_or_capture" so that a thrown exception can be recovered from and does not change the state of TVM's CUDA graph cache.
The call to cudaStreamDestroy was previously skipped when an exception was thrown. It is now made in the RAII-style destructor of a ScopedCUDAStream class.
The call to cudaStreamEndCapture was previously skipped when an exception was thrown. The CUDA graph capture is now ended as part of the RAII-style destructor of the CUDACaptureStream class.
Restoration of CUDAThreadEntry::ThreadLocal()->stream was previously skipped when an exception was thrown. The stream is now restored as part of the RAII-style destructor of the CUDACaptureStream class.
Previously, an error raised from cudaGraphInstantiate would leave capture_cache_ in an ill-formed state. Now, capture_cache_ is updated only after a valid CUDAGraphCapturedState has been fully constructed.
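
The pattern described above can be illustrated with a minimal, CUDA-free sketch. It is not the code in this commit: `FakeRuntime`, `ScopedStream`, `CaptureGuard`, `CapturedState`, and `GraphCache` are hypothetical stand-ins for the CUDA runtime, `ScopedCUDAStream`, `CUDACaptureStream`, `CUDAGraphCapturedState`, and the owner of `capture_cache_`. The sketch only shows the two ideas: cleanup lives in destructors so it runs during stack unwinding, and the cache is written only after the captured state is complete.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for CUDA runtime state; not the real API.
struct FakeRuntime {
  int live_streams = 0;    // streams created but not yet destroyed
  bool capturing = false;  // true between begin-capture and end-capture
  int current_stream = 0;  // plays the role of the thread-local stream
};

// Owns a stream. The destructor releases it, even during stack unwinding
// (the role cudaStreamDestroy plays in the real code).
class ScopedStream {
 public:
  ScopedStream(FakeRuntime& rt, int id) : rt_(rt), id_(id) { ++rt_.live_streams; }
  ~ScopedStream() { --rt_.live_streams; }
  ScopedStream(const ScopedStream&) = delete;
  ScopedStream& operator=(const ScopedStream&) = delete;
  int id() const { return id_; }

 private:
  FakeRuntime& rt_;
  int id_;
};

// Begins capture and swaps in the capture stream. The destructor ends the
// capture and restores the previous stream.
class CaptureGuard {
 public:
  CaptureGuard(FakeRuntime& rt, int stream) : rt_(rt), prev_stream_(rt.current_stream) {
    assert(!rt_.capturing);  // nested capture is not modelled in this sketch
    rt_.current_stream = stream;
    rt_.capturing = true;
  }
  ~CaptureGuard() {
    rt_.capturing = false;
    rt_.current_stream = prev_stream_;
  }
  CaptureGuard(const CaptureGuard&) = delete;
  CaptureGuard& operator=(const CaptureGuard&) = delete;

 private:
  FakeRuntime& rt_;
  int prev_stream_;
};

struct CapturedState {
  std::string graph;  // placeholder for an instantiated graph
};

class GraphCache {
 public:
  explicit GraphCache(FakeRuntime& rt) : rt_(rt) {}

  // Returns the cached state for `key`, capturing it first if needed.
  // `body` may throw; in that case the cache is left untouched.
  const CapturedState& RunOrCapture(int key, const std::function<std::string()>& body) {
    auto it = cache_.find(key);
    if (it != cache_.end()) return it->second;

    CapturedState state;
    {
      ScopedStream stream(rt_, /*id=*/1);
      CaptureGuard guard(rt_, stream.id());
      state.graph = body();  // on throw, guard then stream are destroyed
    }
    // Reached only when capture succeeded: commit the finished state.
    return cache_.emplace(key, std::move(state)).first->second;
  }

  std::size_t size() const { return cache_.size(); }

 private:
  FakeRuntime& rt_;
  std::map<int, CapturedState> cache_;
};
```

In this sketch, a caller can catch an exception thrown from `body`, observe that the runtime state and the cache are unchanged, and call `RunOrCapture` again with the same key.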