Open · CompRhys opened 2 years ago
I don't think another flag should be added. Why not move the print out to the Trainer constructor so it's only printed once?
It already is. In the linked issue (#13358), in order to reduce uncontrollable verbosity, I was advised to create a secondary Trainer. There's no need to persist this trainer in the secondary optimisation loop, so it gets deleted by the gc and reinitialised when needed.
In general, afaik, uncontrollable verbosity isn't ideal, and in terms of unhelpful verbosity the number of TPUs, HPUs, and IPUs is less likely to be informative than the model summary, which there is an option to suppress.
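For illustration, a minimal sketch of the pattern being described (assuming a trained LightningModule `model` and a DataLoader `loader`; both names are placeholders):

```python
import pytorch_lightning as pl

# Inner loop of a surrogate-model optimisation: a throwaway Trainer is
# created per predict call and garbage-collected between iterations.
for step in range(10):
    trainer = pl.Trainer(
        enable_progress_bar=False,   # suppressible via existing flag
        enable_model_summary=False,  # suppressible via existing flag
        logger=False,
    )
    # The device info lines ("GPU available: ...", "TPU available: ...")
    # are still emitted every iteration, interleaving with the loop's
    # own output.
    preds = trainer.predict(model, dataloaders=loader)
```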
I'm curious, is there a desire to have verbosity controlled on a more global level, not just the summary here?
I think that between the enable flags for the model summary and progress bars (making a new trainer when needed to adjust them) and the possibility of making things `PossibleUserWarning`s, you can control pretty much anything apart from this device summary. A verbose=int setup, c.f. sklearn, could work but would be a much bigger change.
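For reference, a rough sketch of what can already be silenced today (assuming pytorch_lightning 1.6+, where these flags and `PossibleUserWarning` exist):

```python
import warnings

import pytorch_lightning as pl
from pytorch_lightning.utilities.warnings import PossibleUserWarning

# Filter out the advisory warnings Lightning raises as PossibleUserWarning.
warnings.filterwarnings("ignore", category=PossibleUserWarning)

# The model summary and progress bar have dedicated flags; the device
# summary ("GPU available: ...") does not.
trainer = pl.Trainer(
    enable_model_summary=False,
    enable_progress_bar=False,
)
```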
Happy for me to update the associated PR so that it can be reviewed?
@CompRhys thanks for the PR. Since this is adding a flag to the core API in Trainer, we need to discuss it with the core team @Lightning-AI/core-lightning and get some more opinions.
I think there are also a few options we haven't explored yet:
1. Move the device info printout into a callback
2. Let the messages be more easily filtered through logging
3. Introduce a verbose flag to control messaging through Trainer more generally (e.g. fast_dev_run infos)
Options 2 and 3 seem like better solutions. I'd prefer 3 if there are more logs we could configure.
I'd prefer a combination of 1. and 2.
IMO it really isn't necessary to have that baked into the core trainer (same as the model summary was not necessary).
And having it more easily filtered would also be great (I tried to forward the streams to something else to just avoid the prints and that also didn't work).
Today, it can be silenced by doing this:
```python
import logging

def device_info_filter(record):
    # Drop the device summary lines ("GPU available: ...",
    # "TPU available: ...", etc.), which all contain "PU available: ".
    return "PU available: " not in record.getMessage()

logging.getLogger("pytorch_lightning.utilities.rank_zero").addFilter(device_info_filter)
```
I find the callback idea (1) a bit overkill.
With (2) we can improve the above, maybe by using a Trainer logger instead of the rank zero logger.
(3) seems like it has a larger scope. It would be interesting to see what your concrete ideas are. But for the device info message in particular, we've always agreed that it should be shown, not just when fast_dev_run=True.
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!
I changed my mind. I think the callback proposal is the simplest and most extensible option. This would also resolve https://github.com/Lightning-AI/lightning/issues/11014. And we could have flags in the callback to disable specific prints.
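For illustration only, a hypothetical callback along these lines could look like the following (the class name, flags, and hook choice are assumptions, not an agreed design):

```python
import torch
import pytorch_lightning as pl

class DeviceSummary(pl.Callback):
    """Hypothetical callback that owns the device info printout."""

    def __init__(self, show_gpu: bool = True, show_tpu: bool = True):
        self.show_gpu = show_gpu
        self.show_tpu = show_tpu

    def setup(self, trainer, pl_module, stage=None):
        if trainer.global_rank != 0:  # print once, from rank zero only
            return
        if self.show_gpu:
            print(f"GPU available: {torch.cuda.is_available()}")
        # ...analogous lines for TPU/IPU/HPU, each behind its own flag.
```

Usage might then look like `Trainer(callbacks=[DeviceSummary(show_tpu=False)])`.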
I think I can take this up!
Did anything come from this? My initial PR was never reviewed -- https://github.com/Lightning-AI/lightning/pull/13379
~Seems like it made its way upstream? https://lightning.ai/docs/pytorch/stable/common/trainer.html#enable-model-summary~
Apologies, confused it with enable_device_summary. It would make sense for them to be in the same place though.
🚀 Feature
Add an `enable_device_summary` boolean kwarg to `pl.Trainer()` to suppress `log_device_info()`'s output.
Motivation
When calling predict within a surrogate-model loop, Trainer prints out the devices each time, breaking apart intended tables and other outputs. Related to https://github.com/Lightning-AI/lightning/issues/13358 on cleaning up / reducing stdout verbosity.
Pitch
Add an `enable_device_summary` kwarg to `Trainer` that defaults to `True`.
Alternatives
The suggested solution is the simplest; any alternative would add more complexity.
Additional context
None
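For concreteness, usage of the requested flag would look like this (`enable_device_summary` is the proposal here, not part of the released Trainer API):

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    enable_device_summary=False,  # proposed flag: suppress log_device_info()
    enable_model_summary=False,   # existing flag, for comparison
    enable_progress_bar=False,    # existing flag, for comparison
)
```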
cc @borda @awaelchli @ananthsub @rohitgr7 @justusschock @kaushikb11