Closed djstrong closed 4 years ago
Describe the bug
Bug while training a sequence tagger with `amp_opt_level=O2`.
To Reproduce
Version: 0.4.5, GPU: V100
Run https://github.com/flairNLP/flair/blob/master/train.py with added:
`use_amp=True, amp_opt_level='O2'`
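A minimal sketch of such a run, assuming a setup along the lines of the linked train.py. The corpus, embeddings and hyperparameters below are placeholders chosen to match the log that follows (the actual dataset used had 12543 train / 2002 dev / 2077 test sentences), not the exact script:

```python
from flair.datasets import CONLL_03
from flair.embeddings import StackedEmbeddings, WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# placeholder corpus -- the original run used a different NER dataset
corpus = CONLL_03()
tag_type = "ner"
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

embeddings = StackedEmbeddings(embeddings=[WordEmbeddings("glove")])

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type=tag_type,
    use_crf=True,
)

trainer = ModelTrainer(tagger, corpus)
trainer.train(
    "resources/taggers/example-ner",
    learning_rate=0.1,
    mini_batch_size=32,
    max_epochs=20,
    embeddings_storage_mode="cpu",
    use_amp=True,           # requires NVIDIA apex to be installed
    amp_opt_level="O2",     # 'O1' works; 'O2' crashes as shown below
)
```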
```
2020-02-08 12:14:02,758 ----------------------------------------------------------------------------------------------------
2020-02-08 12:14:02,758 Corpus: "Corpus: 12543 train + 2002 dev + 2077 test sentences"
2020-02-08 12:14:02,758 ----------------------------------------------------------------------------------------------------
2020-02-08 12:14:02,759 Parameters:
2020-02-08 12:14:02,759  - learning_rate: "0.1"
2020-02-08 12:14:02,759  - mini_batch_size: "32"
2020-02-08 12:14:02,759  - patience: "3"
2020-02-08 12:14:02,759  - anneal_factor: "0.5"
2020-02-08 12:14:02,759  - max_epochs: "20"
2020-02-08 12:14:02,759  - shuffle: "False"
2020-02-08 12:14:02,759  - train_with_dev: "False"
2020-02-08 12:14:02,759  - batch_growth_annealing: "False"
2020-02-08 12:14:02,759 ----------------------------------------------------------------------------------------------------
2020-02-08 12:14:02,759 Model training base path: "resources/taggers/example-ner"
2020-02-08 12:14:02,759 ----------------------------------------------------------------------------------------------------
2020-02-08 12:14:02,760 Device: cuda:0
2020-02-08 12:14:02,760 ----------------------------------------------------------------------------------------------------
2020-02-08 12:14:02,760 Embeddings storage mode: cpu
Selected optimization level O2:  FP16 training with FP32 batchnorm and FP32 master weights.

Defaults for this optimization level are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O2
cast_model_type        : torch.float16
patch_torch_functions  : False
keep_batchnorm_fp32    : True
master_weights         : True
loss_scale             : dynamic
2020-02-08 12:14:02,764 ----------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "<stdin>", line 8, in <module>
  File "venv/lib/python3.7/site-packages/flair/trainers/trainer.py", line 331, in train
    loss = self.model.forward_loss(batch_step)
  File "venv/lib/python3.7/site-packages/flair/models/sequence_tagger_model.py", line 493, in forward_loss
    features = self.forward(data_points)
  File "venv/lib/python3.7/site-packages/apex/amp/_initialize.py", line 197, in new_fwd
    **applier(kwargs, input_caster))
  File "venv/lib/python3.7/site-packages/flair/models/sequence_tagger_model.py", line 498, in forward
    self.embeddings.embed(sentences)
  File "venv/lib/python3.7/site-packages/flair/embeddings.py", line 177, in embed
    embedding.embed(sentences)
  File "venv/lib/python3.7/site-packages/flair/embeddings.py", line 87, in embed
    for token in sentence.tokens:
AttributeError: 'NoneType' object has no attribute 'tokens'
```
Additional context: The error appears with both the cpu and gpu embeddings storage modes. Training works with `O1` (though it is a little slower).
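For context on the two levels: when `use_amp=True`, flair hands the model and optimizer to apex's `amp.initialize` (the "Selected optimization level ..." banner in the log above comes from that call). A rough, self-contained sketch of the difference, with a toy model standing in for the SequenceTagger:

```python
import torch
from apex import amp  # NVIDIA apex must be installed

model = torch.nn.Linear(4, 2).cuda()   # stand-in for the SequenceTagger
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# O1 patches torch functions and leaves model.forward() untouched -- this is
# the level that works in this report.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

# O2 (the failing level) casts the model to FP16 and wraps model.forward() so
# that its inputs are cast as well -- that wrapper is the new_fwd /
# input_caster frame visible in the traceback above.
# model, optimizer = amp.initialize(model, optimizer, opt_level="O2")
```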
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
same problem here