@rsn870 thanks so much for identifying these great improvements to DeepSpeed. Yes, our current TensorBoard support is very limited, but we are interested in improvements that users will find useful. Can you share a more detailed description of your model?
So my model is basically used for style transfer. Most style transfer scenarios deal with multiple losses, and it is quite necessary to have good visualisations of the landscape of each individual loss.
A generic TensorBoard utility integrated into DeepSpeed would save some effort.
I would also love to see utilities that help fine-tune more complex optimization scenarios involving multiple models and losses at the same time.
A simple first step might be to give users an access point to our SummaryWriter object in order to allow arbitrary logging from client code. We could also provide a simple log_scalar() interface so folks can do some simple logging without diving into TensorBoard. I can see some confusion coming from the asymmetry of the SummaryWriter only being present on global rank 0.
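To make the idea concrete, here is a rough sketch of what such a helper could look like (the names get_summary_writer() and log_scalar() are illustrative, not an existing DeepSpeed API; rank checking via torch.distributed is assumed):

```python
import torch.distributed as dist
from torch.utils.tensorboard import SummaryWriter

_writer = None

def get_summary_writer(log_dir="./deepspeed_tb"):
    """Return a shared SummaryWriter on global rank 0, None on all other ranks."""
    global _writer
    if dist.is_available() and dist.is_initialized() and dist.get_rank() != 0:
        return None
    if _writer is None:
        _writer = SummaryWriter(log_dir=log_dir)
    return _writer

def log_scalar(tag, value, step):
    """Safe to call from every rank; only global rank 0 actually writes."""
    writer = get_summary_writer()
    if writer is not None:
        writer.add_scalar(tag, value, step)
```

Hiding the rank-0 check inside log_scalar() would let client code call it unconditionally and avoid the asymmetry mentioned above.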
Hi @rsn870,
Since the logging capabilities were expanded in GH-2013 to support TensorBoard, WandB, and CSV logging formats (see documentation here), I'll close the issue for now.
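For reference, a minimal sketch of the relevant monitoring section of a DeepSpeed config, shown as a Python dict that can be passed to deepspeed.initialize() (field names recalled from the docs, so please double-check against the linked documentation; values are placeholders):

```python
# Monitoring section of a DeepSpeed config dict; enable whichever backends you need.
ds_config = {
    "tensorboard": {
        "enabled": True,
        "output_path": "output/ds_logs/",
        "job_name": "my_experiment",
    },
    "csv_monitor": {
        "enabled": True,
        "output_path": "output/ds_logs/",
        "job_name": "my_experiment",
    },
}
```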
Feel free to open another issue if there are additional requests for expanded logging capabilities.
Thanks, Lev
Hi,
The tensorboard config present by default in DeepSpeed only provides logging for train/loss and train/lr. This is quite limited. For example, in my training:

```python
inputs = data.to(model_engine.device)
mean_style = mean_style.to(model_engine.device)
latent_rnd = torch.randn(micro_batch_size, 512).to(model_engine.device)
losses, encode_lst = model_engine(inputs, mean_style, latent_rnd, mode='train')
total = torch.sum(torch.stack(losses))
```

It is total that I pass to backward as the loss, so I only get graphs for it. However, I am also interested in viewing graphs for each individual component of the losses list, to fine-tune the optimization if necessary; currently this requires me to use an external TensorBoard writer.
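Concretely, the external logging I have in mind looks roughly like this (a sketch: the SummaryWriter setup, rank guard, trainloader loop, and step counter are assumptions wrapped around the snippet above):

```python
import torch
import torch.distributed as dist
from torch.utils.tensorboard import SummaryWriter

# Guard so that only global rank 0 writes TensorBoard events.
writer = SummaryWriter("runs/style_transfer") if dist.get_rank() == 0 else None

for step, data in enumerate(trainloader):  # trainloader is my own data loader
    inputs = data.to(model_engine.device)
    mean_style = mean_style.to(model_engine.device)
    latent_rnd = torch.randn(micro_batch_size, 512).to(model_engine.device)
    losses, encode_lst = model_engine(inputs, mean_style, latent_rnd, mode='train')
    total = torch.sum(torch.stack(losses))

    if writer is not None:
        # One curve per loss component, plus the combined loss.
        for i, loss_i in enumerate(losses):
            writer.add_scalar(f"train/loss_{i}", loss_i.item(), step)
        writer.add_scalar("train/loss_total", total.item(), step)

    model_engine.backward(total)
    model_engine.step()
```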
Instead of using the default config, could a utility be provided to plug in a 'customizable' TensorBoard writer?