Closed by mmahdavian 1 year ago
Hi,
Sorry, I have not tried it that way, and I'm not sure whether it would work. I recommend that you set args.distributed to True and fix the tensorboard code at https://github.com/opendilab/InterFuser/blob/main/interfuser/train.py#L1420 and https://github.com/opendilab/InterFuser/blob/main/interfuser/train.py#L1676; you can fill in the corresponding code for args.distributed=False. It's very easy: I think just removing the reduce_tensor code will be enough.
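For anyone reading later, here is a rough sketch of the idea of skipping the reduce step on a single GPU. It is not the actual code from train.py: RecordingWriter, loss_for_logging, log_step, and the "train/loss" tag are all hypothetical names made up for illustration, and the loss is a plain float so the snippet runs without PyTorch. The stub writer only mimics the add_scalar(tag, value, step) call that a TensorBoard SummaryWriter provides.

```python
from typing import Callable, List, Optional, Tuple


class RecordingWriter:
    """Hypothetical stand-in for a TensorBoard writer; records add_scalar calls."""

    def __init__(self) -> None:
        self.scalars: List[Tuple[str, float, int]] = []

    def add_scalar(self, tag: str, value: float, step: int) -> None:
        self.scalars.append((tag, value, step))


def loss_for_logging(
    loss: float,
    distributed: bool,
    world_size: int = 1,
    reduce_fn: Optional[Callable[[float, int], float]] = None,
) -> float:
    """Return the value to log (hypothetical helper, not from the repository)."""
    if distributed:
        if reduce_fn is None:
            raise ValueError("distributed logging needs a reduce function")
        # Multi-process case: average the loss across workers before logging.
        return reduce_fn(loss, world_size)
    # Single process: there is nothing to average, so skip the reduce step.
    return loss


def log_step(
    writer: RecordingWriter,
    loss: float,
    step: int,
    distributed: bool = False,
    world_size: int = 1,
    reduce_fn: Optional[Callable[[float, int], float]] = None,
) -> float:
    """Log the loss in both modes, so tensorboard is written either way."""
    value = loss_for_logging(loss, distributed, world_size, reduce_fn)
    writer.add_scalar("train/loss", value, step)
    return value


if __name__ == "__main__":
    writer = RecordingWriter()
    log_step(writer, 0.5, step=3)  # single GPU, args.distributed=False
    print(writer.scalars)
```

The point is only that the logging call sits outside the distributed branch, so it runs in both modes. In the real script the loss would be a tensor and the writer a real TensorBoard writer.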
OK, thank you.
Hello,
I am training a model on a single GPU, and you told me earlier that I need to set args.distributed to False. In another issue you mentioned that this stops tensorboard from collecting data, but I need that data. My question: if I change args.distributed from False to True in the sections that update the loss, and set args.world_size to 1, would that solve the problem? Could I then get results in tensorboard while keeping the original structure of the code intact?
Thank you.