Bug
I get the error below. It occurs with both pytorch-nightly and pytorch-1.1.0, and I have re-installed apex and maskrcnn-benchmark many times. Can anyone help? Thanks!
Traceback (most recent call last):
  File "tools/train_net.py", line 171, in <module>
    main()
  File "tools/train_net.py", line 164, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_net.py", line 73, in train
    arguments,
  File "/home/ubuntu/msrcnn/maskscoring_rcnn/maskrcnn_benchmark/engine/trainer.py", line 83, in do_train
    with amp.scale_loss(losses, optimizer) as scaled_losses:
  File "/home/ubuntu/miniconda3/envs/msrcnn/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/ubuntu/miniconda3/envs/msrcnn/lib/python3.6/site-packages/apex-0.1-py3.6-linux-x86_64.egg/apex/amp/handle.py", line 81, in scale_loss
    raise RuntimeError("Invoked 'with amp.scale_loss`, but internal Amp state has not been initialized. "
RuntimeError: Invoked 'with amp.scale_loss`, but internal Amp state has not been initialized. model, optimizer = amp.initialize(model, optimizer, opt_level=...) must be called before `with amp.scale_loss`.
Environment
Here is my environment info. The same error occurs with both pytorch-nightly and pytorch-1.1.0.
Collecting environment information...
PyTorch version: 1.1.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130
OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: version 3.13.3
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration:
GPU 0: Tesla V100-SXM2-16GB
GPU 1: Tesla V100-SXM2-16GB
GPU 2: Tesla V100-SXM2-16GB
GPU 3: Tesla V100-SXM2-16GB
Nvidia driver version: 418.40.04
cuDNN version: Could not collect
Versions of relevant libraries:
[pip3] numpy==1.15.4
[conda] blas 1.0 mkl
[conda] mkl 2019.4 243
[conda] mkl_fft 1.0.12 py36ha843d7b_0
[conda] mkl_random 1.0.2 py36hd81dba3_0
[conda] pytorch 1.1.0 py3.6_cuda10.0.130_cudnn7.5.1_0 pytorch
[conda] torchvision 0.3.0 py36_cu10.0.130_1 pytorch
Could you post a (small) reproducible code snippet?
Did you initialize the model and optimizer with amp.initialize before calling the loss scaler, as the error message suggests?
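For reference, here is a minimal sketch of the call order the error message asks for: amp.initialize runs once on the model and optimizer, and only afterwards is amp.scale_loss entered inside the training loop. The two Apex calls are the ones named in the traceback. Everything else is an assumption made so the sketch runs without Apex or a GPU: setup_amp, backward_step, StubAmp and Recorder are hypothetical names, not part of Apex or maskrcnn-benchmark, and opt_level="O1" is only an illustrative choice.

```python
from contextlib import contextmanager


def setup_amp(amp, model, optimizer, opt_level="O1"):
    """Initialize Amp once, after model and optimizer exist, before training."""
    # With real Apex: model, optimizer = amp.initialize(model, optimizer, opt_level=...)
    return amp.initialize(model, optimizer, opt_level=opt_level)


def backward_step(amp, loss, optimizer):
    """One backward pass and optimizer step using Amp loss scaling."""
    optimizer.zero_grad()
    # Same pattern as the failing line in trainer.py (do_train).
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()


class StubAmp:
    """Hypothetical stand-in for apex.amp; it only mimics the ordering check."""

    def __init__(self):
        self.initialized = False

    def initialize(self, model, optimizer, opt_level="O1"):
        self.initialized = True
        return model, optimizer

    @contextmanager
    def scale_loss(self, loss, optimizer):
        if not self.initialized:
            raise RuntimeError("amp.initialize must be called before amp.scale_loss")
        yield loss


class Recorder:
    """Hypothetical stand-in for a loss or optimizer; records method calls."""

    def __init__(self):
        self.calls = []

    def zero_grad(self):
        self.calls.append("zero_grad")

    def backward(self):
        self.calls.append("backward")

    def step(self):
        self.calls.append("step")


if __name__ == "__main__":
    amp = StubAmp()  # with Apex installed, use: from apex import amp
    model, optimizer = setup_amp(amp, object(), Recorder())
    loss = Recorder()
    backward_step(amp, loss, optimizer)
    print(loss.calls, optimizer.calls)
```

If the script that builds the model and optimizer (here tools/train_net.py) never makes the amp.initialize call while the trainer it hands off to does use amp.scale_loss, the result would be the RuntimeError shown above. Checking that the two files come from the same version of the code base is a reasonable first step.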