Closed muellerzr closed 1 year ago
@muellerzr, thanks for reporting this error. This is definitely an oversight. Are you able to submit a PR that uses logger instead of print?
@tjruwase sadly don't have the bandwidth/familiarity with the code right now to do so, would appreciate it if someone else could pick it up 🙏
No worries. Thanks for reporting the error. I will link a PR asap.
By the way, are you actually running with tpus? The accelerator abstraction is a fairly new feature to make it easier to run DeepSpeed on arbitrary hardware, and so we appreciate any feedback. Thanks!
@tjruwase nope, not with DeepSpeed yet (@pacman100 on that one with Accelerate?) it's on a GPU runner, just our CLI picks up on it due to import (which was also a great way to know that something weird was happening 😄 )
@muellerzr, could please get the linked PR a try? Thanks!
@tjruwase I have update to 0.94. And There are still too much message "[real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)" printed in std io.
@tjruwase let's reopen this please, because this still isn't the right solution, since it's always printed to the user as @vipcxj states
I found no annoying messages printed when I am not distributed training. However when I spawn the processes to train the model with multi gpu, the annoying messages come back. I don’t know what the two are different, because the codes are the same.
spawn(train_and_test, args=(2,), nprocs=2)
vs
train_and_test()
The bug still exists with version 0.10.2
@tjruwase let's reopen this please, because this still isn't the right solution, since it's always printed to the user as @vipcxj states
Sorry, I missed this earlier.
The bug still exists with version
0.10.2
@king159, @vipcxj, can you please share a simple repro?
@tjruwase:
import deepspeed
python myfile.py
@muellerzr, can you please try https://github.com/microsoft/DeepSpeed/tree/olruwase/ds_3680_2? The issue seems to be that our logger is using default log level info. This hack manually uses debug level.
@king159, @vipcxj FYI
@muellerzr, can you please try https://github.com/microsoft/DeepSpeed/tree/olruwase/ds_3680_2? The issue seems to be that our logger is using default log level info. This hack manually uses debug level.
@king159, @vipcxj FYI
This works fine for me
@muellerzr, can you please try https://github.com/microsoft/DeepSpeed/tree/olruwase/ds_3680_2? The issue seems to be that our logger is using default log level info. This hack manually uses debug level.
@king159, @vipcxj FYI
would you plz check where this message come from, i'm using deepspeed by huggingface accelerate, so, it's difficult to find out if this from deepspeed
@Honesty-of-the-Cavernous-Tissue, that message is from DeepSpeed. Should be fixed by #4310.
@Honesty-of-the-Cavernous-Tissue, that message is from DeepSpeed. Should be fixed by #4310.
thanks, it's bother me for a long time
Both fixes quiet everything down for me @tjruwase 🙏
@muellerzr, thanks for the confirmation. I think this behavior needs to be configurable, so I will work a bit more on the PR before merging.
Got it, thanks @tjruwase. Definitely waiting with anticipation.
Describe the bug When
accelerate
tries to run any command whileDeepSpeed
is available, we get"Setting ds_accelerator ..."
, which while not explicitly bad, is making a test fail due to a non-clean CLI print when non-deepspeed code is used (we have a check foris_deepspeed_available()
which is probably triggering this.As a nice QOL, it would be good to have
get_accelerator
(https://github.com/microsoft/DeepSpeed/blob/master/accelerator/real_accelerator.py#L102) andset_accelerator
(https://github.com/microsoft/DeepSpeed/blob/master/accelerator/real_accelerator.py#L109) use the already-existing logger in the framework if possible, so that we can disable these annoying prints when they're not needed :)To Reproduce Steps to reproduce the behavior:
pip install accelerate deepspeed -U
parser = tpu_command_parser() args = parser.parse_args([ "--config_file", "tests/test_configs/latest.yaml", "--install_accelerate", "--debug" ])
tpu_command_launcher(args)
Expected behavior
A configurable option to silence these print statements by having them run through the logging system instead, perhaps as an
info
or as adebug
.