xy-guo / MVSNet_pytorch

PyTorch Implementation of MVSNet

Some questions about depth_visual_xxx.png. #25

Open StriveZs opened 3 years ago

StriveZs commented 3 years ago

Hi, this is Zander. I recently ran your code on my server and found that it uses depth_visual_xxx.png as the mask to constrain the depth map:

model_loss = mvsnet_loss
model.train()
optimizer.zero_grad()

sample_cuda = tocuda(sample)
depth_gt = sample_cuda["depth"]
mask = sample_cuda["mask"]

outputs = model(sample_cuda["imgs"], sample_cuda["proj_matrices"], sample_cuda["depth_values"])
depth_est = outputs["depth"]

loss = model_loss(depth_est, depth_gt, mask)

In your mvsnet_loss, I see that you use the mask to split the foreground and background:

def mvsnet_loss(depth_est, depth_gt, mask):
    # keep only pixels where the mask marks a valid ground-truth depth
    mask = mask > 0.5
    return F.smooth_l1_loss(depth_est[mask], depth_gt[mask], size_average=True)

But in the original source code, I don't see any code that uses depth_visual.png, so I wonder whether you wrote new code to make use of these masks. I'm hoping for your answer. :-)

StriveZs commented 3 years ago

Sorry for my bad English

XYZ-qiyh commented 3 years ago

@zs670980918 Hi. The ground-truth depth maps used to train MVSNet are rendered from meshes, so not every pixel of the GT depth map has a valid depth value. During training, only the depths predicted at valid pixels take part in the loss computation. The black/white pixels in depth_visual.png indicate whether a valid depth value exists at that location; depth_visual.png is generated from depth_gt, and the mask in the code is obtained by reading depth_visual.png.

https://github.com/xy-guo/MVSNet_pytorch/blob/e0f2ae3d7cb2dd13807b775f2075682eaa7f1521/datasets/dtu_yao.py#L86

https://github.com/xy-guo/MVSNet_pytorch/blob/e0f2ae3d7cb2dd13807b775f2075682eaa7f1521/datasets/dtu_yao.py#L101
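In other words, the mask is just the depth_visual image rescaled to 0~1. A minimal sketch of the idea (not the verbatim repo code; read_mask is a hypothetical helper name, assuming PIL and numpy):

import numpy as np
from PIL import Image

def read_mask(filename):
    # depth_visual_xxx.png is a black/white image: white marks pixels that
    # received a depth value from the mesh rendering, black marks pixels
    # with no valid ground-truth depth
    img = Image.open(filename)
    np_img = np.array(img, dtype=np.float32) / 255.0  # scale 0~255 to 0~1
    return np_img

mask = read_mask("path/to/depth_visual_xxx.png")  # one view's mask file

mvsnet_loss then binarizes it with mask = mask > 0.5, so only the valid pixels contribute to the smooth L1 loss.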

StriveZs commented 3 years ago


Hi, thanks for your reply. I've mostly understood that part of the code. I've also been reading it alongside the original TensorFlow implementation, and it seems the original TensorFlow code doesn't use the dataset's mask in the loss computation at all, as shown below:

loss.py

def non_zero_mean_absolute_diff(y_true, y_pred, interval):
    """ non zero mean absolute loss for one batch """
    with tf.name_scope('MAE'):
        shape = tf.shape(y_pred)
        interval = tf.reshape(interval, [shape[0]])
        mask_true = tf.cast(tf.not_equal(y_true, 0.0), dtype='float32')
        denom = tf.reduce_sum(mask_true, axis=[1, 2, 3]) + 1e-7
        masked_abs_error = tf.abs(mask_true * (y_true - y_pred))            # 4D
        masked_mae = tf.reduce_sum(masked_abs_error, axis=[1, 2, 3])        # 1D
        masked_mae = tf.reduce_sum((masked_mae / interval) / denom)         # 1
    return masked_mae


def mvsnet_regression_loss(estimated_depth_image, depth_image, depth_interval):
    """ compute loss and accuracy """
    # non zero mean absolute loss
    masked_mae = non_zero_mean_absolute_diff(depth_image, estimated_depth_image, depth_interval)
    # less one accuracy
    less_one_accuracy = less_one_percentage(depth_image, estimated_depth_image, depth_interval)
    # less three accuracy
    less_three_accuracy = less_three_percentage(depth_image, estimated_depth_image, depth_interval)

    return masked_mae, less_one_accuracy, less_three_accuracy

train.py

# regression loss
loss0, less_one_temp, less_three_temp = mvsnet_regression_loss(
    depth_map, depth_image, depth_interval)
loss1, less_one_accuracy, less_three_accuracy = mvsnet_regression_loss(
    refined_depth_map, depth_image, depth_interval)
loss = (loss0 + loss1) / 2

From the code above, it looks like masked_mae is computed from the depth interval and the GT depth rather than from the dataset's mask files.

To restate my question: the PyTorch version directly uses the mask from the dataset, and I want to confirm whether the original code does too, because from my reading of the TensorFlow version it doesn't seem to use the mask directly. I'd like to know whether this use of the mask is something the PyTorch author added on their own, or whether I'm just not reading the original code well enough. Please bear with me if I've got anything wrong. Many thanks!
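To make the comparison concrete, here is a minimal PyTorch sketch (hypothetical function names, not taken from either repository) of the two masking styles in question, side by side; note the TensorFlow version additionally divides by depth_interval, which this sketch omits:

import torch
import torch.nn.functional as F

def loss_with_dataset_mask(depth_est, depth_gt, mask):
    # PyTorch repo style: the mask is read from depth_visual_xxx.png
    valid = mask > 0.5
    return F.smooth_l1_loss(depth_est[valid], depth_gt[valid])

def loss_with_implicit_mask(depth_est, depth_gt):
    # TensorFlow repo style: pixels whose GT depth is exactly zero are
    # treated as invalid (mask_true = tf.not_equal(y_true, 0.0) above)
    valid = depth_gt != 0.0
    return F.l1_loss(depth_est[valid], depth_gt[valid])

Both exclude pixels that have no valid ground-truth depth; they differ only in whether that validity information comes from a separate mask image or from the zero pixels of the GT depth map itself.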