Closed canhe173 closed 1 year ago
Hi @canhe173, thanks for reporting the issue.
I was able to reproduce it, and it appears to be related to the release of wandb 0.13.5. I will fix this ASAP, but for now manually reverting to wandb<=0.13.4
should do the trick. You can do this by running pip install wandb==0.13.4
in your conda environment.
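If it is unclear which wandb release an environment actually has, a small check along these lines may help before downgrading. This is only a sketch: the helper names (`parse_release`, `is_affected`, `installed_wandb_version`) are made up for illustration, and it flags 0.13.5 alone because that is the only release named in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Assumption: only the release named in this thread is flagged.
AFFECTED_RELEASE = (0, 13, 5)


def parse_release(version_string: str) -> tuple:
    """Turn '0.13.5' into (0, 13, 5), keeping only the leading digits of each part."""
    numbers = []
    for part in version_string.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, part))
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


def is_affected(version_string: str) -> bool:
    """Return True if the version string matches the release named in this thread."""
    return parse_release(version_string) == AFFECTED_RELEASE


def installed_wandb_version():
    """Return the installed wandb version string, or None if wandb is not installed."""
    try:
        return version("wandb")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_wandb_version()
    if found is None:
        print("wandb is not installed in this environment")
    elif is_affected(found):
        print(f"wandb {found} found; consider: pip install wandb==0.13.4")
    else:
        print(f"wandb {found} found; not the release named in this thread")
```

Run it with the conda environment activated so that it inspects the same interpreter the training script uses.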
Feel free to let me know if this has fixed the issue!
Upstream wandb issue: https://github.com/wandb/wandb/issues/4500
fixed
Hi guys, when I ran
bash test/integration_test.sh
on my computer, I encountered the following problem:
  File "/home/nkjx/cloth-fold/keypoints/keypoint-detection/keypoint_detection/train/train.py", line 115, in <module>
    main(hparams)
  File "/home/nkjx/cloth-fold/keypoints/keypoint-detection/keypoint_detection/train/train.py", line 60, in main
    trainer.fit(model, data_module)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in fit
    call._call_and_handle_interrupt(
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 621, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1058, in _run
    results = self._run_stage()
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1137, in _run_stage
    self._run_train()
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1160, in _run_train
    self.fit_loop.run()
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
    self.on_advance_end()
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py", line 295, in on_advance_end
    self.trainer._call_callback_hooks("on_train_epoch_end")
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1340, in _call_callback_hooks
    fn(self, self.lightning_module, *args, **kwargs)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 312, in on_train_epoch_end
    self._save_topk_checkpoint(trainer, monitor_candidates)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 369, in _save_topk_checkpoint
    self._save_monitor_checkpoint(trainer, monitor_candidates)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 652, in _save_monitor_checkpoint
    self._update_best_and_save(current, trainer, monitor_candidates)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 703, in _update_best_and_save
    self._save_checkpoint(trainer, filepath)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 381, in _save_checkpoint
    logger.after_save_checkpoint(proxy(self))
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/loggers/wandb.py", line 515, in after_save_checkpoint
    self._scan_and_log_checkpoints(checkpoint_callback)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/pytorch_lightning/loggers/wandb.py", line 594, in _scan_and_log_checkpoints
    artifact = wandb.Artifact(name=f"model-{self.experiment.id}", type="model", metadata=metadata)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/wandb/sdk/wandb_artifacts.py", line 164, in __init__
    metadata = _normalize_metadata(metadata)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/site-packages/wandb/sdk/wandb_artifacts.py", line 98, in _normalize_metadata
    Dict[str, Any], json.loads(json.dumps(util.make_safe_for_json(metadata)))
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home/nkjx/miniconda3/envs/keypoint-detection/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Tensor is not JSON serializable
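
For anyone reading the traceback: the last frames show json.dumps being handed artifact metadata that still contains a Tensor, which the standard encoder cannot serialise. The sketch below reproduces that failure mode without torch or wandb installed. `FakeTensor`, `to_json_safe`, and the metadata keys are hypothetical stand-ins chosen for illustration; this is not the fix applied in wandb or in this repository, just one way such metadata could be made serialisable.

```python
import json
from typing import Any


class FakeTensor:
    """Hypothetical stand-in for a 0-dim torch.Tensor; only .item() is modelled."""

    def __init__(self, value: float) -> None:
        self._value = value

    def item(self) -> float:
        return self._value


def to_json_safe(obj: Any) -> Any:
    """Recursively replace scalar-like objects exposing .item() with plain Python values."""
    if isinstance(obj, dict):
        return {str(key): to_json_safe(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_json_safe(value) for value in obj]
    if callable(getattr(obj, "item", None)):
        return obj.item()
    return obj


def try_dump(metadata: dict) -> str:
    """Return the JSON string, or the TypeError message if encoding fails."""
    try:
        return json.dumps(metadata)
    except TypeError as error:
        return str(error)


# Illustrative metadata only; the real keys come from the logger.
metadata = {"score": FakeTensor(0.25), "original_filename": "example.ckpt"}

if __name__ == "__main__":
    print(try_dump(metadata))                # the encoder rejects the tensor-like value
    print(try_dump(to_json_safe(metadata)))  # plain floats and strings encode fine
```

With a real torch.Tensor the same idea would apply to single-element tensors via `.item()`; multi-element tensors would need `.tolist()` instead, which this sketch does not handle.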