lzhengning / SubdivNet

Subdivision-based Mesh Convolutional Networks.
MIT License

Questions about your results? #22

Closed lidan233 closed 2 years ago

lidan233 commented 2 years ago

The results of PD-MeshNet in your paper are very different from those in the original paper. I also noticed your answer to question 15, but PD-MeshNet's metric is computed on the primal nodes (i.e., on faces), which means the metric is already strict. So I'm curious about the reason for the gap. Thanks! Have a nice day!

lzhengning commented 2 years ago

Hi, @lidan233

Many people have asked about the inconsistent performance results of MeshCNN / PD-MeshNet. The brief answer is that MeshCNN, PD-MeshNet, and SubdivNet were originally evaluated with different metrics. In our paper, we use a unified metric: per-face accuracy on the raw meshes.

A more detailed discussion can be found in the first paragraph of the segmentation experiment in our paper. Feel free to ask if you have more questions.
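For reference, the unified metric is simply the fraction of raw-mesh faces labeled correctly. A minimal sketch (the function name and array shapes are my own, not from the repo):

```python
import numpy as np

def per_face_accuracy(pred_labels, gt_labels):
    """Fraction of faces whose predicted label matches the ground truth.

    pred_labels, gt_labels: integer arrays of shape (num_faces,),
    one label per face of the raw (original-resolution) mesh.
    """
    pred_labels = np.asarray(pred_labels)
    gt_labels = np.asarray(gt_labels)
    assert pred_labels.shape == gt_labels.shape
    return float(np.mean(pred_labels == gt_labels))

# Example: 3 of 4 faces correct -> 0.75
print(per_face_accuracy([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75
```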

lidan233 commented 2 years ago

Give me one day! Thanks!

lidan233 commented 2 years ago

I read your paper carefully. Forgive my limited background; I have the following questions or possible misunderstandings.

Doesn't your method use the same metric as PD-MeshNet for segmentation? In your code:


```python
def train(net, optim, dataset, writer, epoch):
    net.train()
    acc = 0
    for meshes, labels, _ in tqdm(dataset, desc=str(epoch)):
        mesh_tensor = to_mesh_tensor(meshes)
        mesh_labels = jt.int32(labels)
        outputs = net(mesh_tensor)
        loss = nn.cross_entropy_loss(outputs.unsqueeze(dim=-1), mesh_labels.unsqueeze(dim=-1), ignore_index=-1)
        optim.step(loss)
        preds = np.argmax(outputs.data, axis=1)
        # Per-face accuracy: correctly labeled faces divided by face count 'Fs'.
        acc += np.sum((labels == preds).sum(axis=1) / meshes['Fs'])
        writer.add_scalar('loss', loss.data[0], global_step=train.step)
        train.step += 1
    acc /= dataset.total_len
    print('train acc = ', acc)
    writer.add_scalar('train-acc', acc, global_step=epoch)
```

In PD-MeshNet:

```python
# targets are the labels of each face
targets = primal_graph_batch.y
# Compute number of correct predictions.
num_correct_predictions = compute_num_correct_predictions(
    task_type=self.__task_type,
    outputs=outputs,
    targets=targets)
# The number of predictions corresponds to the number of samples
# in the batch in case of mesh classification (in which a single
# label is assigned to each shape) and to the number of total
# mesh faces in the batch in case of mesh segmentation (in which
# a label is assigned to each face).
num_predictions_in_batch = targets.shape[0]
total_num_correct_predictions += num_correct_predictions
total_num_predictions += num_predictions_in_batch
```

Therefore, both your accuracy and PD-MeshNet's accuracy are computed per face. What, then, is the difference between your metric and PD-MeshNet's?

lzhengning commented 2 years ago

You are right that both SubdivNet and PD-MeshNet use per-face accuracy. However, PD-MeshNet was originally evaluated on the simplified meshes, which contain only 1,000+ faces. FYI, the original meshes in humanseg and coseg contain up to 10,000+ faces.

lidan233 commented 2 years ago

I agree with you. I also notice that the attention mechanism in PD-MeshNet is based on local information: they use local attention values as the criterion for removing edges. Therefore, their method may not be suitable for large meshes. Do you agree with this reasoning? In addition, I can only find the coseg data with 1,000+ faces. How did you obtain the high-resolution meshes? Which subdivision method did you choose? Thanks! Have a nice day!

lzhengning commented 2 years ago

PD-MeshNet was trained on the simplified meshes, and I projected its segmentation results onto the raw meshes, because I found their code was not easy to modify for the original humanseg and coseg datasets. So I cannot simply conclude that PD-MeshNet is unsuitable for large meshes.
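One plausible way to project per-face labels from a simplified mesh onto the raw mesh is a nearest-face-centroid lookup. This is only a sketch of that idea (all names are mine; the actual projection used in the paper may differ):

```python
import numpy as np

def project_labels_to_raw(simp_verts, simp_faces, simp_labels,
                          raw_verts, raw_faces):
    """Assign each raw-mesh face the label of the simplified-mesh face
    whose centroid is nearest to the raw face's centroid.

    verts: (V, 3) float arrays; faces: (F, 3) int arrays of vertex indices;
    simp_labels: (F_simплified,) int labels, one per simplified face.
    """
    simp_centroids = simp_verts[simp_faces].mean(axis=1)   # (Fs, 3)
    raw_centroids = raw_verts[raw_faces].mean(axis=1)      # (Fr, 3)
    # Pairwise distances between raw and simplified face centroids.
    d = np.linalg.norm(raw_centroids[:, None, :] - simp_centroids[None, :, :],
                       axis=-1)                            # (Fr, Fs)
    nearest = d.argmin(axis=1)                             # (Fr,)
    return np.asarray(simp_labels)[nearest]
```

For meshes with tens of thousands of faces, a KD-tree (e.g. `scipy.spatial.cKDTree`) would avoid the quadratic distance matrix.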

But I have tested MeshCNN on segmenting 10,000-face meshes, and it performs worse than on the simplified meshes. In my opinion, the subdivision-based pooling layer and the dilated convolution matter most for large meshes, because both provide reasonable and larger receptive fields. This seems to be a more effective strategy than removing edges according to intermediate features, as MeshCNN and PD-MeshNet do.

The original coseg dataset can be downloaded from http://irc.cs.sdu.edu.cn/~yunhai/public_html/ssl/ssd.htm. The Loop subdivision scheme, i.e., 1-to-4 triangle splitting, is used throughout the paper, though other subdivision schemes may also be compatible.
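As a concrete illustration of the 1-to-4 splitting step (not the repo's implementation; full Loop subdivision additionally repositions vertices for smoothing, which this sketch omits):

```python
import numpy as np

def subdivide_1_to_4(verts, faces):
    """One round of 1-to-4 triangle splitting: insert a midpoint on every
    edge and replace each triangle with four smaller ones.  Shared edges
    reuse the same midpoint vertex so the result stays watertight."""
    verts = [tuple(v) for v in verts]
    midpoint = {}  # edge key (i, j), i < j -> index of its midpoint vertex

    def mid(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint:
            midpoint[key] = len(verts)
            verts.append(tuple((np.array(verts[i]) + np.array(verts[j])) / 2))
        return midpoint[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        # One corner triangle per original vertex, plus the central one.
        new_faces += [[a, ab, ca], [ab, b, bc], [ca, bc, c], [ab, bc, ca]]
    return np.array(verts), np.array(new_faces)

# One triangle becomes four; k rounds turn F faces into F * 4**k.
v = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
f = np.array([[0, 1, 2]])
v2, f2 = subdivide_1_to_4(v, f)
print(len(f2), len(v2))  # 4 6
```

Repeated splitting is what turns a ~1,000-face simplified mesh into the 10,000+-face resolution discussed above.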