IDEA-Research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
https://detrex.readthedocs.io/en/latest/
Apache License 2.0
1.92k stars 202 forks

DINO training error when using internimage #251

Open winpih opened 1 year ago

winpih commented 1 year ago

The following error occurs. Please check.

/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py:173: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [1280, 1, 3, 3], strides() = [9, 1, 3, 1]
bucket_view.sizes() = [1280, 1, 3, 3], strides() = [9, 9, 3, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:326.)
  Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass

rentainhe commented 1 year ago

OK, we will look into this bug soon~

winpih commented 1 year ago

Thank you sir!!
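
A note on the two stride lists in the warning from the original report. For the shape [1280, 1, 3, 3], the reported grad strides [9, 1, 3, 1] are what the usual channels-last (NHWC) stride formula gives, and the bucket view strides [9, 9, 3, 1] are what the usual contiguous (NCHW) formula gives. Because the second dimension has size 1, both describe the same element offsets in memory. The sketch below only checks that arithmetic in plain Python; the helper names are hypothetical, it does not call PyTorch, and it does not establish where in the model the layout difference comes from.

```python
from itertools import product


def contiguous_strides(shape):
    """Row-major (NCHW-style) strides: the last dimension varies fastest."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return list(reversed(strides))


def channels_last_strides(shape):
    """Strides for a 4-D NCHW-shaped tensor stored in NHWC order."""
    n, c, h, w = shape
    return [h * w * c, 1, w * c, c]


def element_offsets(shape, strides):
    """Memory offset of every element, visited in logical index order."""
    return [
        sum(i * s for i, s in zip(index, strides))
        for index in product(*(range(d) for d in shape))
    ]


# Shape taken from the warning in this thread.
shape = [1280, 1, 3, 3]

bucket_view = contiguous_strides(shape)   # matches bucket_view strides()
grad = channels_last_strides(shape)       # matches grad strides()

# With a size-1 channel dimension, both stride sets address identical offsets.
same_layout = element_offsets(shape, bucket_view) == element_offsets(shape, grad)

print(bucket_view, grad, same_layout)
```

This is consistent with the warning's own statement that the mismatch is not an error. Whether it affects performance in this training run is not something the sketch can show.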