Closed. jonaskohler closed this issue 3 years ago.
Hi! Thanks for your contribution, great first issue!
First, if you would like to run validation before training starts, you can use `num_sanity_val_steps`: https://pytorch-lightning.readthedocs.io/en/latest/trainer.html#num-sanity-val-steps
Second, setting `log_every_n_steps=100` will only log every 100 steps. If you would like to log at step 0, you can check `batch_idx == 0` in `training_step`/`validation_step` and log whatever you want there, as sketched below.
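A minimal sketch of that pattern (the `LitModel` class, its layer dimensions, and the learning rate are made up for illustration):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)  # hypothetical toy model

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        if batch_idx == 0:
            # explicitly log the very first batch, independent of
            # log_every_n_steps, to capture behavior at initialization
            self.log("train_loss_first_batch", loss)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)


# run the full validation loop once before training, and log every 100 steps
trainer = pl.Trainer(num_sanity_val_steps=-1, log_every_n_steps=100)
```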
Would it make sense to allow logging at global_step == 0?
not sure... what does everyone think?
Thanks for your answers. For me (speaking from a research perspective) it is definitely useful to know what's happening right at initialization, for example when examining different initialization schemes.
yeah, i don’t think i mind logging batch 0.
@ananthsub?
Please allow me a follow-up question: when passing num_sanity_val_steps=-1 to the trainer, the method validation_epoch_end is called at iteration 0, but for some reason nothing is logged. I tried both `self.log(...)` and returning `{'log': logs}`.
As soon as the first epoch is over, everything logs as normal. What do I need to do to make the first log show up on TensorBoard? Thanks in advance.
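As far as I can tell, Lightning skips writing to the logger while the sanity check is running, which would explain why nothing appears until the first real validation epoch ends. For reference, a minimal sketch of the pattern being described, using the older `validation_epoch_end` hook (the toy layer is made up):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)  # hypothetical toy model

    def validation_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def validation_epoch_end(self, outputs):
        # outputs collects the values returned by validation_step;
        # self.log is the supported way to send metrics to the logger
        self.log("val_loss", torch.stack(outputs).mean())
```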
Logging at global step 0 sounds good to me. I think it'd simplify the logging internals too, from `(global_step + 1) % log_every_n_steps` to just `global_step % log_every_n_steps`. Since `global_step` is 0-indexed, with `log_every_n_steps = 10` this would log at the 1st, 11th, 21st, ... call to `training_step`.
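A toy comparison of the two conditions (this is plain Python for illustration, not Lightning's actual internals):

```python
log_every_n_steps = 10

# current behavior: the first log only happens after log_every_n_steps steps
current = [s for s in range(30) if (s + 1) % log_every_n_steps == 0]
# proposed behavior: global_step == 0 is included
proposed = [s for s in range(30) if s % log_every_n_steps == 0]

print(current)   # [9, 19, 29]
print(proposed)  # [0, 10, 20]
```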
Hi, I have a somewhat similar question: is there any way to log the first training epoch (or the first several iterations) without doing optimization? I want to get a baseline for the initialized model and compare loss values on both training and validation samples.
Thanks in advance, -ea
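One possible way to get such a baseline, as a sketch: assuming a Lightning version whose `Trainer.validate` accepts a `dataloaders` argument, you can run the validation loop over both dataloaders before calling `fit`, so no optimizer step ever runs. `LitModel`, `train_loader`, and `val_loader` are placeholders:

```python
import pytorch_lightning as pl

model = LitModel()  # the freshly initialized model
trainer = pl.Trainer()

# run the validation loop without any optimizer steps; weights stay untouched
trainer.validate(model, dataloaders=val_loader)    # validation-set baseline
trainer.validate(model, dataloaders=train_loader)  # "training loss" baseline

trainer.fit(model, train_loader, val_loader)       # then train as usual
```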
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!
Hi all
I have two issues with logging in Lightning. First, I would like to run a full validation loop at the beginning to get the validation loss at random initialization.
Second, I set `log_every_n_steps=100` for the pl.Trainer and would have expected the trainer to also log at step 0, but it starts at 100.
Thus, I cannot see how the network behaves at initialization on either the train or the validation set. I tried to find a fix in the documentation but didn't succeed.
Any help is highly appreciated :)