BaguaSys / bagua

Bagua Speeds up PyTorch
https://tutorials-8ro.pages.dev/
MIT License

What's wrong with this? Do I need to do anything else? Will it affect my result? #336

Closed. lixiangMindSpore closed this issue 3 years ago.

lixiangMindSpore commented 3 years ago

Describe the bug

[screenshot of the warning message]

Environment

Reproducing

Please provide a minimal working example (runnable code) and the exact commands required to reproduce your results.

Additional context

NOBLES5E commented 3 years ago

The message means the memory layout of your PyTorch tensor is inconsistent with the latest PyTorch, so Bagua will fall back to a less efficient way of getting a tensor's memory address. It will not affect your results. You can safely ignore it if your training runs fine.
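For context, "getting a tensor's memory address" refers to the raw data pointer of the tensor's underlying storage. A minimal PyTorch sketch (illustrative only, not Bagua's internal code) of the address and layout queries involved:

import torch

# Illustration only: how a tensor's raw memory address and layout
# can be inspected in PyTorch.
t = torch.randn(4, 4)

print(t.data_ptr())       # raw address of the first element
print(t.is_contiguous())  # True: the default layout is contiguous

# A transposed view shares the same storage but is no longer contiguous.
v = t.t()
print(v.data_ptr() == t.data_ptr())  # True: same underlying memory
print(v.is_contiguous())             # False: the strides are permuted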

lixiangMindSpore commented 3 years ago

The message means the memory layout of your PyTorch tensor is inconsistent with the latest PyTorch, so Bagua will fall back to a less efficient way of getting a tensor's memory address. It will not affect your results. You can safely ignore it if your training runs fine.

Will it affect my training speed?

NOBLES5E commented 3 years ago

Only about a 0.5%-2% difference in training speed in our tests. We actually plan to remove this warning in the next release.

lixiangMindSpore commented 3 years ago

Only about a 0.5%-2% difference in training speed in our tests. We actually plan to remove this warning in the next release.

Now I'd like to suppress this warning. How can I do that?

NOBLES5E commented 3 years ago

Try launching your program with the environment variable LOG_LEVEL=error, like this:

export LOG_LEVEL=error
python -m bagua.distributed.launch ....
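
Alternatively, the variable can be set inline for a single run (standard shell behavior, nothing Bagua-specific):

LOG_LEVEL=error python -m bagua.distributed.launch ....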