How to use tensorboard

mmahdavian commented 1 year ago

Hello

I am using one GPU to train a model, and you told me before that I need to change args.distributed to False. In another issue you mentioned that this stops tensorboard from collecting data, but I need that data. My question: if I change args.distributed from False to True in the sections where the loss is updated, and set args.world_size to 1, would that solve the problem? Could I then get results in tensorboard while keeping the original structure of the code intact?

Thank You

deepcs233 commented 1 year ago

Hi, sorry, I have not tried it that way, so I'm not sure whether it can work. I recommend that you set args.distributed to True and fix the tensorboard code at https://github.com/opendilab/InterFuser/blob/main/interfuser/train.py#L1420 and https://github.com/opendilab/InterFuser/blob/main/interfuser/train.py#L1676; you can add the corresponding code for the args.distributed=False case. It's very easy. I think just removing the reduce_tensor code will be enough.
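
For illustration, the suggestion above might look roughly like the sketch below. This is not the code from train.py. `log_scalar`, `reduce_fn` and `RecordingWriter` are hypothetical names made up for this example. `reduce_fn` stands in for reduce_tensor, and in real training the writer would be a `torch.utils.tensorboard.SummaryWriter`, whose `add_scalar(tag, scalar_value, global_step)` method is the only call assumed here. The idea is to skip the reduction when there is a single process and write the local value directly.

```python
from typing import Callable, Optional


def log_scalar(writer, tag: str, value: float, step: int,
               distributed: bool = False,
               world_size: int = 1,
               reduce_fn: Optional[Callable[[float, int], float]] = None) -> float:
    """Write one scalar to the writer, with or without distributed training.

    Hypothetical helper: `reduce_fn` plays the role of reduce_tensor.
    """
    if distributed and reduce_fn is not None:
        # Multi-process run: average the value across processes first.
        value = reduce_fn(value, world_size)
    # Single-process run: the local value is already final, so it is
    # written as-is instead of being skipped.
    writer.add_scalar(tag, value, step)
    return value


class RecordingWriter:
    """Hypothetical stand-in for SummaryWriter so the sketch runs without torch."""

    def __init__(self):
        self.records = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.records.append((tag, scalar_value, global_step))


writer = RecordingWriter()
# args.distributed=False case: no reduction, the value is still logged.
log_scalar(writer, "train/loss", 0.25, step=10, distributed=False)
```

With a real `SummaryWriter(log_dir=...)` passed in place of `RecordingWriter`, the same call would produce an event file that tensorboard can read.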

mmahdavian commented 1 year ago

OK, thank you.

opendilab / InterFuser

How to use tensorboard #14