argonne-lcf / dlio_benchmark

An I/O benchmark for deep learning applications
https://dlio-benchmark.readthedocs.io
Apache License 2.0

Changing logging levels #222

Open zhenghh04 opened 3 months ago

zhenghh04 commented 3 months ago

In this PR, we changed the per-step output from info to debug to reduce the logging overhead.

We also added support for changing the logging level.

hariharan-devarajan commented 3 months ago

@zhenghh04 Can we not print the file name and line number for info logging? We should only do that for debug logging. This will significantly reduce log size.
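One way this suggestion could look with Python's standard `logging` module is sketched below. The helper name `log_format` and the logger name `dlio_sketch` are hypothetical and are not taken from the PR; the sketch only shows choosing a format string with `%(filename)s:%(lineno)d` when the level is DEBUG.

```python
import logging


def log_format(level: int) -> str:
    """Return a format string; add file name and line number only for DEBUG."""
    base = "[%(levelname)s] %(asctime)s %(message)s"
    if level <= logging.DEBUG:
        # Source location is useful when debugging but inflates every line.
        return base + " [%(filename)s:%(lineno)d]"
    return base


def configure(level: int) -> logging.Logger:
    """Configure a hypothetical logger with the level-dependent format."""
    logger = logging.getLogger("dlio_sketch")  # hypothetical name
    logger.handlers.clear()
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(log_format(level)))
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False
    return logger


configure(logging.INFO).info("short line without source location")
configure(logging.DEBUG).debug("line with source location appended")
```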

zhenghh04 commented 3 months ago

@hariharan-devarajan, I made the changes as you suggested. Please review it again.

zhenghh04 commented 3 months ago

@hariharan-devarajan, I am undecided between using workflow.log_level and using a DLIO_LOG_LEVEL environment variable to control the log level. I lean slightly toward the latter, which is more common in other apps. What is your preference?

hariharan-devarajan commented 3 months ago

I see value in both, but I also prefer the environment variable. By default, we should use the WARN level, not info. Then DLIO_LOG_LEVEL can be set to enable more verbose levels such as INFO and DEBUG.

We should switch the per-epoch time to print and include the per-epoch sample rate there.

Then, per-step output becomes info.

And variable logging becomes debug.
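A minimal sketch of the environment-variable approach proposed above, assuming the variable holds a level name such as `INFO` or `DEBUG` and that the default is WARNING. The helper `level_from_env` and the accepted spellings are assumptions for illustration, not the PR's actual code.

```python
import logging
import os

# Accepted spellings are an assumption; unknown values fall back to the default.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def level_from_env(env=None, default=logging.WARNING):
    """Map DLIO_LOG_LEVEL to a logging level, defaulting to WARNING."""
    env = os.environ if env is None else env
    name = env.get("DLIO_LOG_LEVEL", "").strip().upper()
    return _LEVELS.get(name, default)


logging.getLogger("dlio_sketch.env").setLevel(level_from_env())
```

With this shape, running with `DLIO_LOG_LEVEL=DEBUG` set in the environment would enable the most verbose output, while an unset or unrecognized value keeps the quieter default.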

zhenghh04 commented 3 months ago

@hariharan-devarajan After reading the documentation more closely (https://docs.python.org/3/library/logging.html#levels), I feel that the default logging level should be info, and everything should go through logging, not through print.

What do you think?

hariharan-devarajan commented 3 months ago

So printing is different from logging, in my opinion.

Printing is for the things that tell us about the benchmark at a high level, such as initialization, progress, and metrics.

Logging here is for the internal parts.

I wrote a logger in C++ in the past in which I added print as the highest log level, above error.

For benchmarks, I feel it makes sense to have this.
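The C++ logger mentioned above is not shown in this thread. As a rough Python analogue, the standard `logging` module allows registering a custom level above the built-in ones. The level name `PRINT`, its numeric value, and the logger name below are all assumptions for illustration.

```python
import logging

# Hypothetical level placed above CRITICAL (50) so it is never filtered
# out by any of the standard level settings.
PRINT = logging.CRITICAL + 10
logging.addLevelName(PRINT, "PRINT")

logger = logging.getLogger("dlio_sketch.print_level")  # hypothetical name
logger.handlers.clear()
logger.addHandler(logging.StreamHandler())
logger.propagate = False
logger.setLevel(logging.ERROR)

logger.log(PRINT, "epoch 1 finished")  # emitted: PRINT is above ERROR
logger.info("internal detail")         # suppressed: INFO is below ERROR
```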

hariharan-devarajan commented 3 months ago

Now, if you want to use logging for printing, I would create two loggers: one for printing and one for logging internal stuff.

The printing logger could always be at info internally, and could not be changed by benchmark parameters.
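The two-logger idea could be sketched as follows, assuming one logger fixed at INFO for benchmark output and a second whose level is configurable. The logger names, the `build_loggers` helper, and the choice of stdout versus stderr are hypothetical.

```python
import logging
import sys


def build_loggers(internal_level=logging.WARNING):
    """Return (output, internal) loggers; only the internal level is tunable."""
    output = logging.getLogger("dlio_sketch.output")      # hypothetical name
    internal = logging.getLogger("dlio_sketch.internal")  # hypothetical name

    output.setLevel(logging.INFO)      # fixed: not exposed as a parameter
    internal.setLevel(internal_level)  # could come from config or environment

    for lg, stream in ((output, sys.stdout), (internal, sys.stderr)):
        lg.handlers.clear()
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(message)s"))
        lg.addHandler(handler)
        lg.propagate = False  # avoid duplicate lines via the root logger
    return output, internal


output, internal = build_loggers()
output.info("epoch 1: progress and metrics go here")  # always shown
internal.info("per-step detail")  # hidden at the default WARNING level
```

Keeping the two loggers separate means lowering the internal level to DEBUG adds detail without affecting what the benchmark reports as its results.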