awslabs / mxboard

Logging MXNet data for visualization in TensorBoard.
Apache License 2.0

Configuring logging output #10

Closed fhieber closed 6 years ago

fhieber commented 6 years ago

Is it possible to configure the logger of mxboard? If one frequently uses statements such as

with SummaryWriter(logdir='.') as sw:
  sw.add_scalar(tag, value, global_step)

the log of the application becomes quite cluttered with outputs like:

[INFO:root] Successfully opened events file: events.out.tfevents.1522834055.186590d665ad.ant.amazon.com
[INFO:root] Wrote 1 event to disk
[INFO:root] Wrote 1 event to disk
[INFO:root] Wrote 6 events to disk

reminisce commented 6 years ago

@fhieber Please correct me if I've misunderstood your question. If you are logging a series of scalar values and don't want to use the with SummaryWriter(...) statement, you can create a writer once with sw = SummaryWriter(logdir) and log values through sw.add_scalar(tag, value, global_step). You then need to call sw.close() manually at the end. Using the with statement frequently is not recommended: each use incurs the overhead of initializing all the I/O operations, and in some cases a later with SummaryWriter may overwrite the records of an earlier one. The with statement is useful when you can wrap your code in its scope, since it frees you from having to remember to call sw.close(). It is analogous to using open() in a with statement in Python.

One example of not using the with statement is https://github.com/awslabs/mxboard/blob/master/examples/mnist/train_mnist_mxboard.py#L101
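
A minimal sketch of the single-writer pattern described above. The helper name log_scalars, the "loss" tag, and the ./logs directory are made up for illustration; only SummaryWriter(logdir), add_scalar(tag, value, global_step), and close() come from the comment above. The mxboard import sits inside a function so the helper can be tried without mxboard installed.

```python
def log_scalars(sw, tag, values):
    """Log one scalar per step through an already-open writer.

    `sw` is any object exposing add_scalar(tag, value, global_step),
    the call quoted in the comment above.
    """
    for step, value in enumerate(values):
        sw.add_scalar(tag, value, step)
    return len(values)


def train_with_mxboard():
    """Illustrative only: requires mxboard (and MXNet) to be installed."""
    from mxboard import SummaryWriter

    sw = SummaryWriter(logdir="./logs")  # create the writer once
    try:
        log_scalars(sw, "loss", [0.9, 0.5, 0.3])
    finally:
        sw.close()  # manual close, since no with statement is used
```

Calling train_with_mxboard() opens a single events file for the whole run instead of one per with block.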

fhieber commented 6 years ago

Thanks @reminisce, it's good to know that there is some I/O overhead in repeatedly constructing SummaryWriters. Assuming that we use a single SummaryWriter instance during training and flush it after every write, is there a risk of data loss if the training process crashes at some point? Or would you recommend using a try/finally block or an ExitStack context to ensure the SummaryWriter is closed?

My original question was more around the logging statements done by mxboard. I am not very familiar with Python logging in general, but is there a way to disable the logging statements shown above?

reminisce commented 6 years ago

@fhieber Regarding closing the summary writer gracefully, here is an analysis of several situations.

  1. If you flush every time after calling add_xxx, there is almost no risk of data loss. However, this is not the recommended way of using the logger: flush blocks the main thread until writing to the file finishes, whereas add_xxx only puts the item into a queue for logging, which is much faster than writing directly to files. A separate thread does the actual writing.
  2. If a hard crash happens in the middle of a training job without raising any exception, for example a segmentation fault in C++ code (probably caused by an improper implementation), the items still in the logging queue will be lost.
  3. If a crash happens and raises an exception in Python, you need to either wrap your code in the scope of with SummaryWriter or use a try...finally: sw.close() block to prevent data loss. In the former case, you don't need to call sw.close() explicitly, as the with statement does that for you.
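
The three situations above can be illustrated with a toy model. ToyAsyncWriter is a hypothetical stand-in written for this sketch, not mxboard's implementation: it only mimics the described behavior, where adding is a cheap enqueue, flushing blocks until the queue is drained, and a background thread does the writing. The train function shows the try/finally pattern from item 3.

```python
import queue
import threading


class ToyAsyncWriter:
    """Hypothetical stand-in for a queue-backed event writer (not mxboard code)."""

    def __init__(self):
        self.written = []  # stands in for the events file on disk
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                self._queue.task_done()
                return
            self.written.append(item)
            self._queue.task_done()

    def add(self, item):
        self._queue.put(item)  # cheap: only enqueues

    def flush(self):
        self._queue.join()  # blocks until every queued item is written

    def close(self):
        self._queue.put(None)  # queued items are written before the sentinel
        self._worker.join()


def train(writer, steps, fail_at=None):
    """Log one item per step; always close the writer, even on an exception."""
    try:
        for step in range(steps):
            if step == fail_at:
                raise RuntimeError("simulated crash at step %d" % step)
            writer.add(("loss", step))
    finally:
        writer.close()
```

With train(ToyAsyncWriter(), 5, fail_at=3), the exception still propagates, but the items queued before the failure reach the written list because close() runs in the finally clause. A hard crash such as a segmentation fault would skip the finally clause entirely, which is situation 2.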

reminisce commented 6 years ago

@fhieber Regarding suppressing the logging messages, I have added a parameter to SummaryWriter. You can set verbose=False to disable those logging messages. Let me know if it doesn't work. Thanks. https://github.com/awslabs/mxboard/blob/master/python/mxboard/writer.py#L202
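
For setups where verbose=False is not available, or where only the "Wrote N events to disk" lines should go, a standard-library logging filter is one possible workaround. This sketch assumes the messages go through Python's logging module, as the [INFO:root] prefix in the output above suggests, and that their text matches the lines quoted in the question. The class and function names are invented for illustration.

```python
import logging
import re


class DropEventWriteMessages(logging.Filter):
    """Drop records such as 'Wrote 1 event to disk'; let everything else pass."""

    _pattern = re.compile(r"^Wrote \d+ events? to disk$")

    def filter(self, record):
        # Returning False tells the handler to discard the record.
        return self._pattern.match(record.getMessage()) is None


def install_filter(logger=None):
    """Attach the filter to every current handler of `logger` (root by default)."""
    if logger is None:
        logger = logging.getLogger()
    flt = DropEventWriteMessages()
    for handler in logger.handlers:
        handler.addFilter(flt)
    return flt
```

install_filter() only affects handlers that already exist, so it should be called after the application has configured logging, for example after logging.basicConfig(...).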

fhieber commented 6 years ago

Works, thanks!