Loss problem when training on DTU data

Be997398715 commented 2 years ago

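Thanks to the authors for open-sourcing the code! When I try to train on the DTU data (with both the new and the legacy training code), I always get an error in the patchmatchnet_loss function. The error message is:

File "/media/user/Data/patchmatchnet_improve/PatchmatchNet-main/models/net.py", line 341, in patchmatchnet_loss
    loss = loss + F.smooth_l1_loss(depth[mask[i]], gt_depth, reduction="mean")
IndexError: The shape of the mask [2, 1, 84, 80] at index 2 does not match the shape of the indexed tensor [2, 1, 512, 640] at index 2

I checked the dimensions of the four stage outputs in depth_patchmatch, and they do not match the dimensions of mask and depth_gt. How should I fix this?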
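The IndexError can be reproduced in isolation, because PyTorch boolean-mask indexing requires the mask shape to match the indexed tensor. The following is a minimal sketch that uses random tensors with the shapes from the traceback; the helper name `check_stage_shapes` is hypothetical and not part of PatchmatchNet.

```python
import torch


def check_stage_shapes(depths, masks):
    """Return (stage, depth_shape, mask_shape) for every stage whose shapes differ."""
    mismatches = []
    for stage, (depth, mask) in enumerate(zip(depths, masks)):
        if depth.shape != mask.shape:
            mismatches.append((stage, tuple(depth.shape), tuple(mask.shape)))
    return mismatches


# Shapes copied from the traceback; the values are random placeholders.
depth = torch.rand(2, 1, 512, 640)
mask = torch.rand(2, 1, 84, 80) > 0.5

try:
    depth[mask]  # boolean indexing with a mask of a different shape
    raised = False
except IndexError:
    raised = True

mismatches = check_stage_shapes([depth], [mask])
```

Running such a check on the per-stage lists before the loss is computed would show which stage disagrees. Note also that the height ratio (512 / 84, about 6.1) and the width ratio (640 / 80 = 8) differ, so no single scale factor relates these two shapes.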

FangjinhuaWang commented 2 years ago

Hi,

I am no longer actively maintaining this repo; Microsoft is the main contributor now. From what I can see, train.py creates scaled ground truth in this order: HxW, 1/2Hx1/2W, 1/4Hx1/4W, 1/8Hx1/8W. In the loss, however, depth_patchmatch has resolutions in this order: 1/8Hx1/8W, 1/4Hx1/4W, 1/2Hx1/2W, HxW. So the order is inverted. You can modify the function create_stage_images to output the ground truth in the correct order. @anmatako, could you have a look?
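To make the inversion concrete, here is a small pure-Python sketch that lists the per-stage sizes for both orders. It assumes a 512x640 full-resolution image (taken from the traceback); the helper `stage_sizes` is hypothetical and only illustrates the ordering described above.

```python
def stage_sizes(height, width, scales):
    """Spatial size (H, W) of each stage for the given scale factors."""
    return [(int(height * s), int(width * s)) for s in scales]


# Ground truth as described for train.py: fine to coarse.
gt_sizes = stage_sizes(512, 640, [1.0, 0.5, 0.25, 0.125])
# depth_patchmatch as described for the loss: coarse to fine.
pred_sizes = stage_sizes(512, 640, [0.125, 0.25, 0.5, 1.0])

# Stages whose ground-truth size differs from the prediction size.
mismatched = [i for i, (g, p) in enumerate(zip(gt_sizes, pred_sizes)) if g != p]
# Reversing the ground-truth list lines the two sequences up.
aligned = list(reversed(gt_sizes)) == pred_sizes
```

With these orders every stage is paired with a tensor of a different size, and reversing one of the lists is enough to align them.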

anmatako commented 2 years ago

I will take a look once I'm able to find some time to get back to this work. This issue is a bit surprising to me, since I have tested quite a bit and have not run into it. @Be997398715 have you followed the instructions for converting the DTU dataset to the newer format that PatchmatchNet now uses? Also, can you please provide more info on how you're calling the script and which data are being used?

Jake0124 commented 2 years ago

@Be997398715 Hello, I ran into the same problem. Have you solved it?

FangjinhuaWang commented 2 years ago

Can you try with what I posted before in this thread: modify the function create_stage_images as:

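from typing import List

import torch
import torch.nn.functional as F


def create_stage_images(image: torch.Tensor) -> List[torch.Tensor]:
    # Return the scales coarse to fine (1/8, 1/4, 1/2, full) to match depth_patchmatch.
    return [
        F.interpolate(image, scale_factor=0.125, mode="nearest"),
        F.interpolate(image, scale_factor=0.25, mode="nearest"),
        F.interpolate(image, scale_factor=0.5, mode="nearest"),
        image,
    ]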
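As a quick sanity check of the function above, the sketch below runs it on a random tensor and collects the spatial size of each stage. The 2x1x512x640 input is an assumption based on the shapes in the traceback, not a value taken from the training script.

```python
from typing import List

import torch
import torch.nn.functional as F


def create_stage_images(image: torch.Tensor) -> List[torch.Tensor]:
    # Same body as posted in this thread: coarse-to-fine order.
    return [
        F.interpolate(image, scale_factor=0.125, mode="nearest"),
        F.interpolate(image, scale_factor=0.25, mode="nearest"),
        F.interpolate(image, scale_factor=0.5, mode="nearest"),
        image,
    ]


# Random placeholder standing in for a batch of ground-truth depth maps.
image = torch.rand(2, 1, 512, 640)
shapes = [tuple(stage.shape[-2:]) for stage in create_stage_images(image)]
```

If the sizes in `shapes` still differ from the sizes of the depth_patchmatch outputs during training, the mismatch likely comes from the input data rather than from the stage order.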

Jake0124 commented 2 years ago

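@FangjinhuaWang Hello, this modification did not solve the problem; the issue seems to be with the data. When I use the DTU data downloaded from your link (https://polybox.ethz.ch/index.php/s/ugDdJQIuZTk4S35), training works normally.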

lorelei616 commented 2 years ago

@Jake0124 Hello, I can no longer download from the link the author provided. Could you share the preprocessed dataset?

FangjinhuaWang commented 2 years ago

@lorelei616 I checked, and the link seems to work fine.

IBelieve1234 commented 1 year ago

I ran into the same problem too. You should train with the dataset provided by the author: https://polybox.ethz.ch/index.php/s/ugDdJQIuZTk4S35. The link can also be downloaded without a VPN.

Ttingyyy commented 6 months ago

Are your training results correct? The results I get all look wrong, and the test results also have problems.

IBelieve1234 commented 6 months ago

No real problems. After a few changes it ran fine!


Ttingyyy commented 6 months ago

What is your environment configuration? I'm using 8 GB of GPU memory with CUDA 11.3 and PyTorch 1.13.0. It runs, but the test results are wrong.

IBelieve1234 commented 6 months ago

RTX 3090; GPU memory usage occasionally reaches 22 GB. My test results are correct.

FangjinhuaWang / PatchmatchNet

Loss problem when training on DTU data #65