cramerlab / warp

GNU General Public License v3.0

M Crashes during species import #21

Open heejongkim opened 4 years ago

heejongkim commented 4 years ago

After I finished the species import dialog, M crashed during processing and produced the following crash log.

7/25/2020 7:42:29 AM: TensorFlow.TFException: OOM when allocating tensor with shape[4,96,64,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[Node: batch_normalization_1/batchnorm/add_1 = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](batch_normalization_1/batchnorm/mul_1, batch_normalization_1/batchnorm/sub)]] Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[Node: l2_loss/_445 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_9125_l2_loss", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

at TensorFlow.TFStatus.CheckMaybeRaise(TFStatus incomingStatus, Boolean last) in D:\Dev\warp2\WarpLib\TensorFlowSharp\Tensorflow.cs:line 351
at TensorFlow.TFSession.Run(TFOutput[] inputs, TFTensor[] inputValues, TFOutput[] outputs, TFOperation[] targetOpers, TFBuffer runMetadata, TFBuffer runOptions, TFStatus status) in D:\Dev\warp2\WarpLib\TensorFlowSharp\Tensorflow.cs:line 2681
at TensorFlow.TFSession.Runner.Run(TFStatus status) in D:\Dev\warp2\WarpLib\TensorFlowSharp\Tensorflow.cs:line 2592
at Warp.NoiseNet3D.Train(Single[] source, Single[] target, Single learningRate, Int32 threadID, Single[]& prediction, Single[]& loss) in D:\Dev\warp2\WarpLib\NoiseNet3D.cs:line 182
at Warp.NoiseNet3D..ctor(String modelDir, int3 boxDimensions, Int32 nThreads, Int32 batchSize, Boolean forTraining, Int32 deviceID) in D:\Dev\warp2\WarpLib\NoiseNet3D.cs:line 128
at Warp.Sociology.Species.CalculateResolutionAndFilter(Single fixedResolution, Action`1 progressCallback) in D:\Dev\warp2\WarpLib\Sociology\Species.cs:line 1594
at M.MainWindow.<>c__DisplayClass28_2.<ButtonPopulationAddSpecies_OnClick>b__4() in D:\Dev\warp2\M\MainWindow.xaml.cs:line 484
at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at M.MainWindow.<>c__DisplayClass28_1.<<ButtonPopulationAddSpecies_OnClick>b__3>d.MoveNext() in D:\Dev\warp2\M\MainWindow.xaml.cs:line 493
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler)
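For scale, the size of the single tensor named in the OOM message can be estimated from its shape. This is only a rough sketch: it assumes 4 bytes per element (the log reports type float, i.e. DT_FLOAT), and the helper name `tensor_bytes` is made up for illustration. It says nothing about the total memory the training graph needs or how much GPU memory was free at the time.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for one dense tensor; 4 bytes per element assumes float32."""
    return prod(shape) * bytes_per_element


# Shape copied from the OOM message in the log above.
shape = (4, 96, 64, 64, 64)
size = tensor_bytes(shape)
print(f"{size} bytes = {size / 2**20:.0f} MiB")
```

The failed allocation is one intermediate buffer of this size. A training pass typically holds many such buffers at once, so the overall footprint is presumably several times larger.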

Any insight into what might have caused this, and any possible fixes, would be very much appreciated.

Thanks.