Closed peterkim333 closed 2 years ago
Hi, I am a developer who uses the CNTK C# package with an RTX 3090 GPU.
Our team has rebuilt CNTK C# against a higher CUDA and cuDNN version. It usually works for training and so on.
However, our program crashes irregularly during neural network training. Training usually finishes fine, but sometimes the program dies.
The error is below.
System.AccessViolationException
Location: CNTK.CNTKLibPINVOKE.Variable__Name(System.Runtime.InteropServices.HandleRef)
Location: CNTK.Variable._Name()
Location: DeepLearningCore.SegmentationNetwork.GetParameters()
Location: DeepLearningCore.Segmentation.TrainNetwork()
Location: DeepLearningCore.Segmentation.Run()
Location: System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
Location: System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
Location: System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
Location: System.Threading.ThreadHelper.ThreadStart()
I have no idea what causes this. We traced the error and found that it occurs when we try to get a specific parameter and its name; that call fails. We are not certain, but it seems to happen with networks of about 500 or more layers. Has anybody else had the same problem?