We would want something a bit more generalized, with optional outputs for end of iteration, end of step, end of train, etc. We ought to be logging to the engine's log object instead of printing to stdout; we can add a printing handler to that log object to do both, of course.
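For concreteness, a minimal sketch of that idea, assuming a `train_step(engine, batch)` update function (the handler names are hypothetical):

```python
import logging

from ignite.engine import Engine, Events

trainer = Engine(train_step)  # train_step is an assumed update function

# log via the engine's logger instead of print(); adding a StreamHandler
# mirrors the messages to stdout as well
trainer.logger.addHandler(logging.StreamHandler())
trainer.logger.setLevel(logging.INFO)

@trainer.on(Events.ITERATION_COMPLETED)
def log_iteration(engine):
    engine.logger.info("iteration %d output: %s", engine.state.iteration, engine.state.output)

@trainer.on(Events.EPOCH_COMPLETED)
def log_epoch(engine):
    engine.logger.info("epoch %d/%d completed", engine.state.epoch, engine.state.max_epochs)

@trainer.on(Events.COMPLETED)
def log_completed(engine):
    engine.logger.info("training finished after %d epochs", engine.state.epoch)
```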
Hi @ericspod,
Thanks for your suggestion about a general StatsLogger. Instead of engine.log, can we put all the useful outputs into engine.state.metrics? We can also get "iteration", "epoch", "max_epochs", "epoch_length", etc. from engine.state. I think engine.state is designed as a unified API to store useful information for all kinds of event handlers. If we're aligned on this direction, I can try to make a PR based on engine.state. Thanks.
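As a rough sketch of what I mean (assuming `trainer` is an ignite Engine that already has metrics attached under engine.state.metrics):

```python
from ignite.engine import Events

@trainer.on(Events.EPOCH_COMPLETED)
def print_stats(engine):
    # everything a stats handler needs is available on engine.state
    state = engine.state
    print(f"Epoch {state.epoch}/{state.max_epochs} "
          f"(iteration={state.iteration}, epoch_length={state.epoch_length})")
    for name, value in state.metrics.items():
        print(f"  {name}: {value}")
```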
Sorry for jumping into your conversation; I'd just like to help and clarify what ignite provides out of the box for this:
In my experience with experiment tracking systems like MLflow or Polyaxon, we can either log to the system via their API (using ignite's wrappers like MLflowLogger or PolyaxonLogger), write events to TensorBoard, or simply print values to stdout, which is then automatically written to a log file. The first and second approaches are obviously more interesting if we would like to compare different runs, etc.
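For example, attaching ignite's TensorboardLogger looks like this (a minimal sketch, assuming `trainer` is an Engine whose step output is the loss value):

```python
from ignite.engine import Events
from ignite.contrib.handlers.tensorboard_logger import TensorboardLogger, OutputHandler

tb_logger = TensorboardLogger(log_dir="tb_logs")
tb_logger.attach(
    trainer,
    # log the step output as a "training/loss" scalar every iteration
    log_handler=OutputHandler(tag="training", output_transform=lambda loss: {"loss": loss}),
    event_name=Events.ITERATION_COMPLETED,
)
```

MLflowLogger and PolyaxonLogger from ignite.contrib.handlers follow the same attach() pattern, so switching the tracking backend is essentially a one-line change.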
HTH
Hi @vfdev-5,
Thanks very much for the detailed explanation! I will take a deep dive into your examples.
And @wyli @ericspod @yanchengnv,
About the usage of Ignite: have we agreed to use only the official Ignite core code, or both the core code and the third-party contrib code? Thanks.
@vfdev-5 @Nic-Ma I've used the log file just for logging messages and such. The SessionSaver class in ptproto creates a new directory in a given parent directory for every new run and sends the log to a file there, along with the checkpoints and saved networks. My subclasses of Engine add extra fields to the state, and we could add more things to it; I would think that metrics should only contain the output from metric handlers and nothing else. Returning to the idea of session handling: if we're saving the whole engine state (or everything except large tensors), then these other things we add will get saved as well.
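Something like this hypothetical sketch, where `trainer` and the directory layout are assumptions rather than the actual SessionSaver code:

```python
import os
import time

from ignite.engine import Events

@trainer.on(Events.STARTED)
def setup_session(engine):
    # one new directory per run, holding the log file, checkpoints, etc.
    run_dir = os.path.join("sessions", time.strftime("%Y%m%d_%H%M%S"))
    os.makedirs(run_dir, exist_ok=True)
    # extra fields attached to the state travel with it if the whole
    # state (minus large tensors) is saved
    engine.state.session_dir = run_dir
```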
@vfdev-5 One thing to mention is that tqdm doesn't play well with JupyterLab for some reason; I believe it's a known bug. I had written a super primitive text progress bar that works, but I don't know if we collectively want to investigate anything else. I really like doing things through Jupyter, so stuff that doesn't rely on tensorboard/visdom is what I would prefer.
There is a tqdm_notebook that works quite well: https://pypi.org/project/tqdm/#ipython-jupyter-integration
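Roughly like this (a sketch assuming a `trainer` Engine and a `train_loader`; in recent tqdm releases the notebook flavour has moved to `tqdm.notebook`):

```python
from ignite.engine import Events
from tqdm import tqdm_notebook  # `from tqdm.notebook import tqdm` in newer releases

pbar = tqdm_notebook(total=len(train_loader), desc="training")

@trainer.on(Events.ITERATION_COMPLETED)
def update_pbar(engine):
    pbar.update(1)

@trainer.on(Events.COMPLETED)
def close_pbar(engine):
    pbar.close()
```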
@fepegar I think that does have issues with JupyterLab; vanilla Jupyter notebook is fine, I think. I don't know why, but they're different.
Yes, I've had trouble before on JupyterLab. But I think installing the widgets extension solves it: https://ipywidgets.readthedocs.io/en/latest/user_install.html#installing-the-jupyterlab-extension
@ericspod I'm also using JupyterLab for development; it provides a cool environment for research, prototyping, testing, etc.

> so stuff that doesn't rely on tensorboard/visdom is what I would prefer.

However, how do you plan to run, and then organize and compare, various trainings for the same task?
@fepegar I thought I had tried that and it didn't fix the issue; maybe it didn't load correctly for me? I'll try again.
@vfdev-5 That is something I wasn't doing in a great way, so we should definitely be targeting ways of supporting JupyterLab as well as tensorboard/visdom.
Yan - Set up a time to discuss the design choice here.
Hi @yanchengnv and @wyli,
I found a similar example in @ericspod's notebook example:

    @trainer.on(Events.EPOCH_COMPLETED)
    def log_training_loss(engine):
        print("Epoch", engine.state.epoch, "Loss:", engine.state.output)

About this task, do you mean something like this? Thanks.