Open
s-waitz opened this issue 3 years ago
Hi,
in your readme it says that the --summarize flag needs to be specified for matcher.py if it was also specified at training time. When I do so I get the following error:

Without the --summarize flag matcher.py is running fine.
Is there any workaround to use matcher.py with summarization?
I have encountered a similar issue.

```
Downloading: 100%|███████████████████████████| 28.0/28.0 [00:00<00:00, 21.7kB/s]
Downloading: 100%|██████████████████████████████| 483/483 [00:00<00:00, 401kB/s]
Downloading: 100%|███████████████████████████| 226k/226k [00:00<00:00, 38.4MB/s]
Downloading: 100%|███████████████████████████| 455k/455k [00:00<00:00, 40.8MB/s]
Downloading: 100%|███████████████████████████| 256M/256M [00:05<00:00, 49.4MB/s]
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertModel: ['vocab_layer_norm.weight', 'vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.bias', 'vocab_projector.weight', 'vocab_transform.weight']
Defaults for this optimization level are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/apex/amp/_initialize.py:25: UserWarning: An input tensor was not cuda.
  warnings.warn("An input tensor was not cuda.")
Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 32768.0
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
step: 0, loss: 0.7943093180656433
Traceback (most recent call last):
  File "train_ditto.py", line 92, in
```
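Two of the warnings in the log above point at the training loop itself: "An input tensor was not cuda." typically means a batch was still on the CPU when it reached the AMP-patched model, and the lr_scheduler message means `lr_scheduler.step()` was called before `optimizer.step()`, the reverse of the order PyTorch 1.1.0+ expects. The sketch below is a generic apex O2 loop that avoids both warnings; it is not the actual train_ditto.py code, and the model, optimizer, scheduler, and batch shapes are placeholders (apex and a CUDA GPU are assumed).

```python
# Generic sketch (not the ditto training code): an apex AMP O2 loop that keeps
# inputs on the GPU and calls optimizer.step() before lr_scheduler.step().
import torch
import torch.nn.functional as F
from apex import amp

device = torch.device("cuda")
model = torch.nn.Linear(10, 2).to(device)                  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=3e-5)  # placeholder optimizer
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

# opt_level="O2" corresponds to the options printed in the log: FP16 model
# weights, FP32 batchnorm, FP32 master weights, dynamic loss scaling.
model, optimizer = amp.initialize(model, optimizer, opt_level="O2")

for step in range(100):
    # Batches must live on the GPU; CPU tensors trigger "An input tensor was not cuda."
    x = torch.randn(32, 10, device=device)
    y = torch.randint(0, 2, (32,), device=device)

    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    with amp.scale_loss(loss, optimizer) as scaled_loss:   # dynamic loss scaling
        scaled_loss.backward()

    optimizer.step()   # step the optimizer first...
    scheduler.step()   # ...then the scheduler, as PyTorch >= 1.1.0 expects
```

The multi_tensor_applier warning, by contrast, is an install-time matter (apex built without --cuda_ext --cpp_ext, per its own message), and an occasional "Gradient overflow. Skipping step" line is normal behavior of dynamic loss scaling rather than an error.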