energy-based-model / Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch

[ECCV 2022] Compositional Generation using Diffusion Models
https://energy-based-model.github.io/Compositional-Visual-Generation-with-Composable-Diffusion-Models/

Error on Windows #17

Closed CHENSU12138 closed 2 years ago

CHENSU12138 commented 2 years ago

Hi, thanks for the great work. I am training a model on CLEVR Relation, and my OS is Windows. The first error I encountered was 'RuntimeError: Distributed package doesn't have NCCL built in'. After I changed the backend from NCCL to Gloo, which does run on Windows, the following error occurred:

  File "scripts/image_train.py", line 99, in <module>
    main()
  File "scripts/image_train.py", line 54, in main
    TrainLoop(
  File "c:\compositional-visual-generation-with-composable-diffusion-models-pytorch-main\composable_diffusion\train_util.py", line 81, in __init__
    self._load_and_sync_parameters()
  File "c:\compositional-visual-generation-with-composable-diffusion-models-pytorch-main\composable_diffusion\train_util.py", line 129, in _load_and_sync_parameters
    dist_util.sync_params(self.model.parameters())
  File "c:\compositional-visual-generation-with-composable-diffusion-models-pytorch-main\composable_diffusion\dist_util.py", line 168, in sync_params
    dist.broadcast(p, 0)
  File "E:\Anaconda3\envs\compose_diff\lib\site-packages\torch\distributed\distributed_c10d.py", line 1408, in broadcast
    work.wait()
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

Is it possible to train with GPU on Windows?

Thank you very much!

nanlliu commented 2 years ago

Unfortunately, I'm not able to help with that, since I don't have any experience training models on Windows.

It may be worth searching online, since other people have probably run into the same issue.
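
For readers hitting the first error in this thread: Windows builds of PyTorch do not include the NCCL backend, so Gloo is the usual substitute there, and Gloo can broadcast and all-reduce CUDA tensors, so GPU training should still be possible. Below is a minimal sketch of choosing the backend at startup. The helper name `choose_backend` and its parameters are hypothetical and not part of this repository; its result would be passed to `torch.distributed.init_process_group`.

```python
import sys


def choose_backend(platform: str = sys.platform, nccl_available: bool = False) -> str:
    """Return the name of a torch.distributed backend for this platform.

    Hypothetical helper for illustration; not part of the repository.
    """
    # NCCL is not shipped in Windows builds of PyTorch, so use Gloo there.
    if platform.startswith("win"):
        return "gloo"
    # Elsewhere, prefer NCCL when the installed build provides it.
    return "nccl" if nccl_available else "gloo"


# Intended use (requires PyTorch; shown as comments to keep the sketch standalone):
#   import torch.distributed as dist
#   backend = choose_backend(nccl_available=dist.is_nccl_available())
#   dist.init_process_group(backend=backend, init_method="env://")
```

Keeping the choice in one function avoids hard-coding "nccl" in the setup code, which is what produces the "doesn't have NCCL built in" error on Windows.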

CHENSU12138 commented 2 years ago

I got it, thank you very much!
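
On the second error in the traceback above: `dist.broadcast(p, 0)` overwrites `p` in place, and autograd rejects in-place writes to a leaf tensor that requires grad. One commonly used workaround is to run the broadcast under `torch.no_grad()`. The sketch below is an assumption about how `sync_params` could be adapted, not the repository's confirmed fix; the `src` parameter and the early return when no process group is initialized are additions made for illustration.

```python
import torch
import torch.distributed as dist


def sync_params(params, src: int = 0) -> None:
    """Broadcast each parameter from rank `src` to all other ranks, in place."""
    # Assumption: without an initialized process group (single-process run)
    # there is nothing to synchronize, so return early.
    if not (dist.is_available() and dist.is_initialized()):
        return
    for p in params:
        # broadcast writes into p in place; no_grad stops autograd from
        # rejecting the write to a leaf tensor that requires grad.
        with torch.no_grad():
            dist.broadcast(p, src)
```

The parameters keep `requires_grad=True` afterwards, so training proceeds as before; only the synchronization step is excluded from gradient tracking.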