How do I run single-GPU inference with a trained model?

LiheYoung / UniMatch

[CVPR 2023] Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation

https://arxiv.org/abs/2208.09910

MIT License

478 stars, 60 forks

How do I run single-GPU inference with a trained model? #30

Closed: xiaoqiang-lu closed this issue 1 year ago

xiaoqiang-lu commented 1 year ago

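Hello. I want to use a trained model to produce predictions. Following the distributed model construction in the training function, multi-GPU prediction works. However, when I build the model and predict without the distributed setup, I run into the following problem:

The code that builds the model is as follows: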

LiheYoung commented 1 year ago

Hi, you need to wrap the model with DDP, as in unimatch.py, before calling load_state_dict:

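# Build the model and move it to the GPU before wrapping it.
model = DeepLabV3Plus(cfg)
model.cuda()
# Wrap with DDP so the parameter names match checkpoint keys that carry the "module" prefix.
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank], broadcast_buffers=False,
                                                  output_device=local_rank, find_unused_parameters=False)
# Load the weights only after wrapping.
model.load_state_dict(checkpoint['model'])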

xiaoqiang-lu commented 1 year ago

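Hello, a new problem has come up:

It seems that when a model trained in distributed mode is loaded on a single GPU, the checkpoint needs to have been saved with model.module.state_dict(). I noticed that the earliest version of the code saved the model with model.module.state_dict(), while the current version uses model.state_dict().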

LiheYoung commented 1 year ago

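If the checkpoint was saved directly with model.state_dict() (the state_dict keys contain "module"), you need to wrap the model with DDP before loading. If it was saved with model.module.state_dict() (the keys do not contain "module"), you can load it directly. The error above occurs because DDP has not been configured; you can set it up as in unimatch.py: https://github.com/LiheYoung/UniMatch/blob/bb3af6c07f1371a8e69e24e31db4f1389427680e/unimatch.py#L40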
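
The key-prefix rule described here can be checked without any GPU. The following is a minimal sketch of a commonly used alternative that is not proposed anywhere in this thread: removing the "module." prefix from the checkpoint keys so that an unwrapped model can load them. The helper names `has_module_prefix` and `strip_module_prefix` and the example key names are hypothetical, and plain integers stand in for tensors.

```python
from collections import OrderedDict

PREFIX = "module."  # prefix that DDP / DataParallel wrappers add to every key


def has_module_prefix(state_dict):
    """Return True if any key looks like it was saved from a wrapped model."""
    return any(key.startswith(PREFIX) for key in state_dict)


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with one leading "module." removed per key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(PREFIX):] if key.startswith(PREFIX) else key
        cleaned[new_key] = value
    return cleaned


# Hypothetical keys; the integers are placeholders for real tensors.
ddp_style = OrderedDict([
    ("module.backbone.conv1.weight", 1),
    ("module.classifier.bias", 2),
])

plain = strip_module_prefix(ddp_style)
print(has_module_prefix(ddp_style), has_module_prefix(plain))
print(list(plain))
```

Under these assumptions, an unwrapped model could load the result with something like `model.load_state_dict(strip_module_prefix(checkpoint['model']))`, which avoids setting up DDP at all for single-GPU prediction.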

xiaoqiang-lu commented 1 year ago

Hello, after adding the above:

I get this error:

LiheYoung commented 1 year ago

You need to add this: https://github.com/LiheYoung/UniMatch/blob/bb3af6c07f1371a8e69e24e31db4f1389427680e/unimatch.py#L28 Also make sure to launch it the same way as in train.sh.

xiaoqiang-lu commented 1 year ago

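Thank you for the explanation. This seems to lead back to building the model in distributed mode, only restricted to a single GPU. I solved the problem by replacing torch.nn.parallel.DistributedDataParallel with torch.nn.DataParallel, which lets me launch the script directly.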
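
A minimal sketch of why this replacement works, under stated assumptions: `torch.nn.DataParallel` also stores the wrapped network under the attribute `module`, so its state_dict keys carry the same "module." prefix as a checkpoint saved from a DDP-wrapped model, and it needs no process group. `TinyNet` is a hypothetical stand-in for `DeepLabV3Plus(cfg)`, and the checkpoint is simulated in memory rather than read from a real file.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for DeepLabV3Plus(cfg): 3 input channels, 2 classes."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 2, kernel_size=1)

    def forward(self, x):
        return self.conv(x)


# Simulate a checkpoint saved from a wrapped model: every key starts with "module.".
source = TinyNet()
checkpoint = {
    "model": {"module." + k: v.clone() for k, v in source.state_dict().items()}
}

# Build the model, move it to the GPU if one exists, then wrap it with DataParallel.
model = TinyNet()
if torch.cuda.is_available():
    model = model.cuda()
model = nn.DataParallel(model)

# The wrapper's keys now match the checkpoint, so loading succeeds without DDP setup.
model.load_state_dict(checkpoint["model"])
model.eval()

with torch.no_grad():
    x = torch.randn(1, 3, 8, 8)
    if torch.cuda.is_available():
        x = x.cuda()
    pred = model(x).argmax(dim=1)  # per-pixel class indices

print(tuple(pred.shape))
```

Because no process group is involved, a script like this can be started with a plain `python` command instead of a distributed launcher.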

Thanks for your patient guidance, and best of luck with your research!

LiheYoung commented 1 year ago

Sounds good!