Encounter error in dcnv3

OpenGVLab / InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

https://arxiv.org/abs/2211.05778

MIT License


Encounter error in dcnv3 #19

Open Youskrpig opened 1 year ago

Youskrpig commented 1 year ago

I am trying to check whether the model can run. GPU 1 is empty, but an error occurs in the forward function:

Youskrpig commented 1 year ago

I also tried model = torchvision.models.resnet50(pretrained=True); its forward function runs normally.

czczup commented 1 year ago

This looks a bit strange, as we have not encountered this issue before. Would you be able to provide more information? This would be helpful for us to pinpoint the issue. Thank you!

Youskrpig commented 1 year ago

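What I mean is that the forward pass during training always uses memory on GPU 0. I have two GPUs. Even when I explicitly place both the input and the model on GPU 1 (I have confirmed that the weights and biases are all on GPU 1), a small amount of memory on GPU 0 is still used, and if GPU 0's memory is already full, an error is raised. So I would like to ask: is there anywhere in the CUDA build that uses GPU 0?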

czczup commented 1 year ago

What about setting CUDA_VISIBLE_DEVICES=1 directly?
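For illustration, the suggestion above could be applied from inside a Python script roughly as follows. This is a minimal sketch, not code from the InternImage repository: the helper name `restrict_visible_gpus` is hypothetical, and it assumes the variable is set before PyTorch initializes CUDA. Once only physical GPU 1 is visible, the process sees it as `cuda:0`, so it cannot allocate memory on physical GPU 0.

```python
import os


def restrict_visible_gpus(physical_ids):
    """Hypothetical helper: expose only the given physical GPUs to this process.

    Must run before CUDA is initialized (safest: before importing torch).
    """
    value = ",".join(str(i) for i in physical_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Expose only physical GPU 1, as suggested in the reply above.
restrict_visible_gpus([1])

try:
    import torch  # imported after the environment variable is set
except ImportError:
    torch = None  # the sketch still runs without PyTorch installed

if torch is not None and torch.cuda.is_available():
    # Physical GPU 1 is now the only visible device, so it is indexed as cuda:0.
    device = torch.device("cuda:0")
    x = torch.zeros(1, device=device)
    print("visible GPUs:", torch.cuda.device_count(), "| tensor on:", x.device)
```

The same effect is usually obtained from the shell by prefixing the launch command with `CUDA_VISIBLE_DEVICES=1`. With this setting, any code that refers to device index 1 needs to be changed to index 0.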