0% 0/103 [00:00<?, ?it/s]/usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
Traceback (most recent call last):
File "/content/MedicalGPT/pretraining.py", line 735, in <module>
main()
File "/content/MedicalGPT/pretraining.py", line 696, in main
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1591, in train
return inner_training_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1950, in _inner_training_loop
self.accelerator.clip_grad_norm_(
File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 2040, in clip_grad_norm_
self.unscale_gradients()
File "/usr/local/lib/python3.10/dist-packages/accelerate/accelerator.py", line 2003, in unscale_gradients
self.scaler.unscale_(opt)
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/amp/grad_scaler.py", line 307, in unscale_
optimizer_state["found_inf_per_device"] = self._unscale_grads_(
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/amp/grad_scaler.py", line 229, in _unscale_grads_
raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.
0% 0/103 [00:02<?, ?it/s]