about the training - Githubissues

jaceqin commented 1 year ago

When I run scripts/segmentation_train.py, I get the following error:

Traceback (most recent call last):
  File "D:\jace\pythonProject\Diffusion-based-Segmentation-main\segmentation_train.py", line 86, in <module>
    main()
  File "D:\jace\pythonProject\Diffusion-based-Segmentation-main\segmentation_train.py", line 66, in main
    TrainLoop(
  File "D:\jace\pythonProject\Diffusion-based-Segmentation-main\guided_diffusion\train_util.py", line 83, in __init__
    self._load_and_sync_parameters()
  File "D:\jace\pythonProject\Diffusion-based-Segmentation-main\guided_diffusion\train_util.py", line 140, in _load_and_sync_parameters
    dist_util.sync_params(self.model.parameters())
  File "D:\jace\pythonProject\Diffusion-based-Segmentation-main\guided_diffusion\dist_util.py", line 77, in sync_params
    dist.broadcast(p, 0)
  File "C:\SoftWare\python 3.10\lib\site-packages\torch\distributed\distributed_c10d.py", line 1408, in broadcast
    work.wait()
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

lansfair commented 1 year ago

@jaceqin How did you solve the problem?

JuliaWolleb commented 1 year ago

Does the problem still occur when you use Python 3.8?

yuan5828225 commented 1 year ago

I commented it out, but I don't know if that's OK.

JuliaWolleb commented 1 year ago

If training still works fine, then that should be OK. I guess this could be an issue with the Python version.
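
For readers hitting the same error: PyTorch raises this RuntimeError when a leaf tensor with requires_grad=True is written to in place while autograd is tracking, and dist.broadcast writes into the parameter it receives. One possible workaround, instead of commenting the call out, is to run the broadcast under torch.no_grad(). The sketch below is an assumption, not the repository's code: sync_params_no_grad is a hypothetical name, and the single-process part only reproduces the same error class with copy_ so it can run without a process group.

```python
import torch
import torch.distributed as dist


def sync_params_no_grad(params):
    """Hypothetical variant of sync_params (name is an assumption).

    Broadcasts rank 0's parameter values with autograd tracking disabled.
    Requires an initialized process group, so it is not called below.
    """
    with torch.no_grad():
        for p in params:
            dist.broadcast(p, 0)


# Single-process illustration of the same error; no process group needed.
p = torch.nn.Parameter(torch.zeros(3))  # leaf tensor with requires_grad=True

try:
    p.copy_(torch.ones(3))  # in-place write while autograd is tracking
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

with torch.no_grad():
    p.copy_(torch.ones(3))  # the same write is allowed under no_grad
```

If the guarded version works in your setup, the parameters still get synchronized across ranks, which matters for multi-GPU training. With a single process there is nothing to synchronize, which may be why commenting the call out appears harmless.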

JuliaWolleb / Diffusion-based-Segmentation

about the training #24