bubbliiiing / yolox-pytorch

This is the source code for yolox-pytorch, which can be used to train your own models.

Apache License 2.0

880 stars 183 forks

Error when using distributed training #124

Open zhn6818 opened 1 year ago

zhn6818 commented 1 year ago

After setting distributed = True, training fails with: RuntimeError: Output 0 of SliceBackward0 is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one. How can this be solved? Any guidance would be appreciated.

bubbliiiing commented 1 year ago

What command did you run? Could you post a screenshot?

zhn6818 commented 1 year ago

CUDA_VISIBLE_DEVICES=4,5,6,7 python -m torch.distributed.launch --nproc_per_node=4 train.py

bubbliiiing commented 1 year ago

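Do you have a screenshot? I can't tell from what has been posted so far. It looks like it might be an environment issue. What environment are you using?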

Jamesgender commented 1 year ago

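This is mainly a version issue. If you want to patch it, change the line that raises the error (probably around line 105 of yolo training) from:

    output[..., :2] = (output[..., :2] + grid.type_as(output)) * stride

to:

    # Write into a copy so the original tensor is not modified in place
    temp_output = output.clone()
    temp_output[..., :2] = (output[..., :2] + grid.type_as(output)) * stride
    output = temp_output
    return output, grid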
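The clone-based change above can be tried in isolation. The following is only a sketch: `decode_xy`, the tensor shapes, and the hand-written grid are assumptions made for illustration and are not taken from the repository.

```python
import torch


def decode_xy(output: torch.Tensor, grid: torch.Tensor, stride: int) -> torch.Tensor:
    """Hypothetical helper: return a copy of `output` with channels 0..1 decoded."""
    # clone() gives a fresh tensor, so the slice assignment below
    # never writes into `output` itself.
    decoded = output.clone()
    decoded[..., :2] = (output[..., :2] + grid.type_as(output)) * stride
    return decoded


# Assumed shapes: 1 image, a 2x2 feature map flattened to 4 cells, 5 channels.
output = torch.zeros(1, 4, 5, requires_grad=True)
# (x, y) index of each cell, written out by hand for clarity.
grid = torch.tensor([[[0, 0], [1, 0], [0, 1], [1, 1]]])

decoded = decode_xy(output, grid, stride=8)
decoded.sum().backward()  # gradients still flow back to `output`
```

The extra copy costs some memory per call, but autograd still tracks the decoded values, so the loss computed from them trains the network as before.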

bubbliiiing commented 1 year ago

Why is that? Is there a reason?

Jamesgender commented 1 year ago

RuntimeError: Output 0 of SliceBackward0 is a view and is being modified inplace.

https://github.com/SHI-Labs/FcF-Inpainting/issues/26
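The same class of error can be reproduced outside the training code. This sketch uses `torch.unbind` as a stand-in for "a function that returns multiple views"; it is an assumption chosen for illustration and not the operation that fails in this repository (the thread does not establish which call produces the view under distributed training). On recent PyTorch versions the in-place write is expected to raise a `RuntimeError`, while the out-of-place version runs normally.

```python
import torch

x = torch.ones(2, 3, requires_grad=True)
base = x * 1                # non-leaf tensor tracked by autograd
row0, row1 = base.unbind(0)  # unbind returns several views of `base`

raised = False
message = ""
try:
    row0.add_(1)             # in-place write into one of those views
except RuntimeError as err:
    raised = True
    message = str(err)

# Out-of-place alternative: build a new tensor instead of editing the view.
row0_fixed = row0 + 1
row0_fixed.sum().backward()
```

Replacing the in-place operation with an out-of-place one, as the error message suggests, is the same idea as the `clone()` change proposed earlier in the thread.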

bubbliiiing commented 1 year ago

Awesome, nicely done!

bubbliiiing commented 1 year ago

I'm wondering whether I should change this in the code.

zhn6818 commented 1 year ago

A bow to the expert. Thank you!

RykerYang commented 11 months ago

I'll kowtow to the expert as well.