Closed zozni closed 2 years ago
Hi, which scikit-learn version are you using?
The scikit-learn version was 0.21.3. After reinstalling the environment, the above issue was resolved. Thanks for the support.
However, another problem has come up. What could be the reason?
/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/sklearn/utils/validation.py:179: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if LooseVersion(joblib_version) < '0.12':
Epoch 0: 94%|████████▎ | 3098/3308 [04:58<00:20, 10.38it/s, loss=0.445, v_num=0_0]
██████▏ | 89/300 [00:45<02:46, 1.27it/s]
Traceback (most recent call last):
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 637, in run_train
    self.train_loop.run_training_epoch()
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 577, in run_training_epoch
    self.trainer.run_evaluation(on_epoch=True)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 725, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 166, in evaluation_step
    output = self.trainer.accelerator.validation_step(args)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 177, in validation_step
    return self.training_type_plugin.validation_step(args)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 131, in validation_step
    return self.lightning_module.validation_step(args, kwargs)
  File "/home/jhj/JEREX/jerex/model.py", line 126, in validation_step
    return self._inference(batch, batch_idx)
  File "/home/jhj/JEREX/jerex/model.py", line 176, in _inference
    output = self(batch, inference=True)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jhj/JEREX/jerex/model.py", line 106, in forward
    max_rel_pairs=max_rel_pairs, inference=inference)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jhj/JEREX/jerex/models/joint_models.py", line 144, in forward
    return self._forward_inference(args, kwargs)
  File "/home/jhj/JEREX/jerex/models/joint_models.py", line 209, in _forward_inference
    mention_sample_masks, max_spans=max_spans, max_coref_pairs=max_coref_pairs)
  File "/home/jhj/JEREX/jerex/models/joint_models.py", line 81, in _forward_inference_common
    mention_reprs = self.mention_representation(h, mention_masks, max_spans=max_spans)
  File "/home/jhj/anaconda3/envs/jerex/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jhj/JEREX/jerex/models/modules/mention_representation.py", line 20, in forward
    chunk_mention_reprs = self._forward(chunk_mention_masks, chunk_h)
  File "/home/jhj/JEREX/jerex/models/modules/mention_representation.py", line 28, in _forward
    mention_reprs = m + h
RuntimeError: CUDA out of memory. Tried to allocate 6.16 GiB (GPU 0; 7.77 GiB total capacity; 1.73 GiB already allocated; 4.80 GiB free; 1.92 GiB reserved in total by PyTorch)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./jerex_train.py", line 24, in
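For what it's worth, the figures in the RuntimeError above already show why the allocation fails: a single request of 6.16 GiB is larger than the 4.80 GiB reported as free. The snippet below is only a rough sketch that pulls those figures out of the message and computes the shortfall. The `parse_oom` helper and its regular expressions are my own illustration (not part of PyTorch or JEREX), and they assume a message of exactly this shape with all values in GiB.

```python
import re

# The message copied from the traceback in this thread.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 6.16 GiB (GPU 0; 7.77 GiB total capacity; "
    "1.73 GiB already allocated; 4.80 GiB free; 1.92 GiB reserved in total by PyTorch)"
)


def parse_oom(message):
    """Hypothetical helper: extract the GiB figures from an OOM message of this shape."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {
        name: float(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


figures = parse_oom(OOM_MESSAGE)
# How much more memory the single allocation needs than the GPU reports as free.
shortfall = figures["requested"] - figures["free"]
print(f"requested {figures['requested']} GiB, free {figures['free']} GiB")
print(f"shortfall: {shortfall:.2f} GiB")
```

Since the failing line is `mention_reprs = m + h` during validation, the likely remedy is to shrink whatever determines the size of that one tensor (for example, how many spans are processed at once) rather than to free small amounts of memory elsewhere.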
This was also a scikit-learn version issue. I upgraded to version 0.23.2 and it was solved.
Testing: 100%|████████| 700/700 [05:24<00:00, 2.27it/s]
Evaluation
--- Entity Mentions ---
Traceback (most recent call last):
  File "./jerex_test.py", line 20, in test
    model.test(cfg)
  File "/home/jhj/jerex/jerex/model.py", line 389, in test
    trainer.test(model, datamodule=data_module)
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 910, in test
    results = self.test_given_model(model, test_dataloaders)
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 970, in test_given_model
    results = self.fit(model)
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 499, in fit
    self.dispatch()
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 540, in dispatch
    self.accelerator.start_testing(self)
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 76, in start_testing
    self.training_type_plugin.start_testing(trainer)
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 118, in start_testing
    self._results = trainer.run_test()
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 786, in run_test
    eval_loop_results, _ = self.run_evaluation()
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 741, in run_evaluation
    deprecated_eval_results = self.evaluation_loop.evaluation_epoch_end()
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 189, in evaluation_epoch_end
    deprecated_results = self.run_eval_epoch_end(self.num_dataloaders)
  File "/home/jhj/anaconda3/envs/New_Env/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 221, in run_eval_epoch_end
    eval_results = model.test_epoch_end(eval_results)
  File "/home/jhj/jerex/jerex/model.py", line 155, in test_epoch_end
    metrics = self._evaluator.compute_metrics(self._eval_test_gt, predictions)
  File "/home/jhj/jerex/jerex/evaluation/joint_evaluator.py", line 76, in compute_metrics
    mention_eval = scoring.score(gt_mentions, pred_mentions, print_results=True)
  File "/home/jhj/jerex/jerex/evaluation/scoring.py", line 55, in score
    metrics = _compute_metrics(gt_flat, pred_flat, labels, labels_str, print_results)
  File "/home/jhj/jerex/jerex/evaluation/scoring.py", line 64, in _compute_metrics
    per_type = prfs(gt_all, pred_all, labels=labels, average=None, zero_division=0)
TypeError: precision_recall_fscore_support() got an unexpected keyword argument 'zero_division'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Testing: 100%|████████| 700/700 [05:24<00:00, 2.16it/s]
Hi. When testing, the error above occurs and the results are not saved. Any ideas?
Thanks.
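The TypeError above is consistent with the versions mentioned in this thread: the `zero_division` keyword of `precision_recall_fscore_support` was added in scikit-learn 0.22, so 0.21.3 rejects it and 0.23.2 accepts it. Below is a minimal sketch of a version guard, in case upgrading is not an option. The helper names (`parse_version`, `supports_zero_division`, `prfs_kwargs`) are made up for illustration and are not part of JEREX or scikit-learn. The version string is parsed by hand only to keep the example dependency-free.

```python
# scikit-learn release that introduced the zero_division keyword.
ZERO_DIVISION_MIN = (0, 22)


def parse_version(text):
    """Turn '0.23.2' into (0, 23, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def supports_zero_division(sklearn_version):
    """True if this scikit-learn version accepts zero_division."""
    return parse_version(sklearn_version) >= ZERO_DIVISION_MIN


def prfs_kwargs(sklearn_version):
    """Keyword arguments for precision_recall_fscore_support,
    leaving out zero_division on releases that do not know it."""
    kwargs = {"average": None}
    if supports_zero_division(sklearn_version):
        kwargs["zero_division"] = 0
    return kwargs


for version in ("0.21.3", "0.23.2"):
    print(version, supports_zero_division(version), prfs_kwargs(version))
```

In a real environment you would pass `sklearn.__version__` instead of a hard-coded string. Upgrading scikit-learn, as was done here, is still the simpler fix, because the guard only avoids the crash and does not reproduce the `zero_division=0` behaviour on old releases.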