Do you have any idea why I received the following error during training?
I'm running the Dockerfile on an RTX 2080 Ti with the latest version of CUDA.
Thank you.
process_0 - Initializing MultitaskQuestionAnsweringNetwork
process_0 - MultitaskQuestionAnsweringNetwork has 14,469,902 trainable parameters
Traceback (most recent call last):
  File "/decaNLP/train.py", line 374, in <module>
    main()
  File "/decaNLP/train.py", line 370, in main
    run(args, run_args, world_size=args.world_size)
  File "/decaNLP/train.py", line 299, in run
    model = init_model(args, field, logger, world_size, device)
  File "/decaNLP/train.py", line 327, in init_model
    model.to(device)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 379, in to
    return self._apply(convert)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 112, in _apply
    self.flatten_parameters()
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 105, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: CuDNN error: CUDNN_STATUS_SUCCESS
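In case it helps, here is a small diagnostic one can run inside the container to report the CUDA/cuDNN versions PyTorch was actually built against and the detected GPU (a generic sketch, not part of decaNLP; it only assumes the standard `torch` package):

```python
def cuda_env_report():
    """Collect the PyTorch/CUDA/cuDNN versions and GPU name, if available."""
    try:
        import torch
    except ImportError:
        # PyTorch not installed in this environment.
        return {"torch": None}
    return {
        "torch": torch.__version__,
        "cuda": torch.version.cuda,                     # CUDA version torch was built with
        "cudnn": torch.backends.cudnn.version(),        # bundled cuDNN version
        "gpu": torch.cuda.get_device_name(0)
               if torch.cuda.is_available() else None,  # detected device, if any
    }

if __name__ == "__main__":
    for key, value in cuda_env_report().items():
        print(f"{key}: {value}")
```

Comparing the reported CUDA build version against the driver/toolkit on the host can reveal a mismatch, which is a common cause of spurious cuDNN errors on newer GPUs.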