ChrisWu1997 / PQ-NET

code for our CVPR 2020 paper "PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes"
MIT License

How to calculate the MMD, COV, and JSD #19

Closed · yolu1055 closed this issue 3 years ago

yolu1055 commented 3 years ago

Hi Rundi,

Hope you are doing well!

Thank you for sharing your great work!

I am now trying to mimic the experiments in your paper, and I have a question about calculating the MMD, COV and JSD.

Currently, I am calculating them as follows (a rough code sketch of steps 2-4 is included right after the list):

  1. Generate 2000 voxel samples at resolution 64^3.
  2. For each voxel grid in the test set and in the generated sample set, generate its corresponding mesh using Marching Cubes.
  3. Normalize each mesh to [-0.5, 0.5]^3.
  4. For each shape, use Trimesh's surface sampling to sample 2000 points on the surface.
  5. Calculate the MMD, COV, and JSD.
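
For concreteness, a rough sketch of steps 2-4 (assuming skimage.measure.marching_cubes and trimesh.sample.sample_surface; the exact calls and flags here are illustrative):

```python
import numpy as np
import trimesh
from skimage import measure

def voxel_to_points(voxel, n_points=2000, threshold=0.5):
    # step 2: Marching Cubes on the 64^3 occupancy grid
    verts, faces, _, _ = measure.marching_cubes(voxel.astype(float), level=threshold)

    # step 3: normalize the vertices so the mesh fits in [-0.5, 0.5]^3
    mid = (verts.max(axis=0) + verts.min(axis=0)) / 2
    extent = (verts.max(axis=0) - verts.min(axis=0)).max()
    verts = (verts - mid) / extent

    # step 4: sample n_points points on the mesh surface
    mesh = trimesh.Trimesh(vertices=verts, faces=faces)
    points, _ = trimesh.sample.sample_surface(mesh, n_points)
    return np.asarray(points)

# pc = voxel_to_points(voxel64)   # voxel64: a (64, 64, 64) occupancy array
```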

================================================================================

The dataset I am using is the Chair dataset you provide. I used your scripts to train PQ-NET and the GAN myself, and I am fairly sure the training was successful. Below are an example of a sampled point cloud and an example of a ground-truth point cloud. Sample from PQ-NET:

Ground truth shape from test set:

The code I am using for calculating the MMD, COV, JSD is directly adapted from this project: https://github.com/stevenygd/PointFlow

My problem is that my scores are very different from those reported in the paper. The comparison is below:

| Method | COV | MMD | JSD |
|---|---|---|---|
| paper results | 54.91 | 8.34 | 0.0083 |
| my results | 35.29 | 3.13 | 0.0073 |

The COV is multiplied by 10^2, and the MMD is multiplied by 10^3.

My JSD and MMD scores are much better than the ones in your paper, while my COV score is much worse. I suspect I may be normalizing the meshes in the wrong way, so that my Chamfer distance calculations are off. Could you give me some hints on how you calculated these three metrics in your experiments? Thank you in advance!

Best regards,

ChrisWu1997 commented 3 years ago

Hi You,

Thanks for your interest in our work!

Two comments on computing these three metrics:

  1. After converting each voxel grid to a point cloud, we normalize it to the unit sphere (so it lies within [-1, 1]^3), since we noticed that the original PartNet shapes lie in a sphere of radius 1. The code below does the normalization:

     import numpy as np

     def scale_to_unit_sphere(points, center=None):  # center is kept from the original signature but unused
         # center the points at the midpoint of their axis-aligned bounding box
         midpoints = (np.max(points, axis=0) + np.min(points, axis=0)) / 2
         # midpoints = np.mean(points, axis=0)
         points = points - midpoints
         # scale so the farthest point lies on the unit sphere
         scale = np.max(np.sqrt(np.sum(points ** 2, axis=1)))
         points = points / scale
         return points

  2. We use the code from https://github.com/optas/latent_3d_points/blob/master/src/evaluation_metrics.py for the calculation, rather than the one from PointFlow. Not sure if this matters.
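
To illustrate why the normalization matters: the two conventions in this thread (the [-0.5, 0.5]^3 cube vs. the unit sphere above) give different per-shape scales, which directly affects Chamfer distances. A small sketch (scale_to_unit_cube is a hypothetical helper written only for this comparison; scale_to_unit_sphere is the function from item 1):

```python
import numpy as np

def scale_to_unit_cube(points):
    # hypothetical counterpart of the [-0.5, 0.5]^3 normalization described earlier in the thread
    midpoints = (np.max(points, axis=0) + np.min(points, axis=0)) / 2
    points = points - midpoints
    extent = np.max(np.max(points, axis=0) - np.min(points, axis=0))
    return points / extent  # longest bounding-box side becomes 1

points = np.random.rand(2000, 3) * np.array([2.0, 1.0, 0.5])  # a fake elongated "shape"
cube_pts = scale_to_unit_cube(points)
sphere_pts = scale_to_unit_sphere(points)  # function from the snippet above

# the two conventions give different per-shape scale factors, which rescales Chamfer
# distances (and so MMD) and can change which generated shape matches which reference shape
print(np.abs(cube_pts).max(), np.linalg.norm(sphere_pts, axis=1).max())  # ~0.5 vs 1.0
```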

Hope the above can help you. Feel free to raise any questions.

yolu1055 commented 3 years ago

Hi Rundi,

Thank you very much for your response! It seems that the main difference lies in the normalization. I will try your method. Thank you for sharing your code!

Best regards,

frankzl commented 3 years ago

Hi, do you remember what resolution you used to compute the jsd between point clouds?

ChrisWu1997 commented 3 years ago

> Hi, do you remember what resolution you used to compute the jsd between point clouds?

It's 28, the default resolution here.
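
For context, that resolution is the side length of the occupancy grid the JSD is computed on: each point-cloud set is binned into a 28^3 grid and the Jensen-Shannon divergence is taken between the two resulting distributions. A rough self-contained sketch of the idea (not the exact reference implementation):

```python
import numpy as np

def grid_distribution(point_clouds, resolution=28, bound=1.0):
    # for each shape, mark which cells of a resolution^3 grid over [-bound, bound]^3 it
    # occupies, accumulate over the whole set, then normalize to a probability distribution
    counts = np.zeros(resolution ** 3)
    for pc in point_clouds:
        idx = np.clip(((pc + bound) / (2 * bound) * resolution).astype(int), 0, resolution - 1)
        flat = np.unique(idx[:, 0] * resolution ** 2 + idx[:, 1] * resolution + idx[:, 2])
        counts[flat] += 1
    return counts / counts.sum()

def jsd(p, q):
    # Jensen-Shannon divergence between two discrete distributions
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# sample_pcs, ref_pcs: lists of (N, 3) point clouds, already normalized as discussed above
# score = jsd(grid_distribution(sample_pcs), grid_distribution(ref_pcs))
```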

frankzl commented 3 years ago

Thanks for the answer! For evaluation using the Chamfer distance, did you sample 2000 lamp objects or only 800? (In the paper it's written that you used only 800 for the comparison with 3D-PRNN, but in the table for the Chamfer distance it says 2000)

ChrisWu1997 commented 3 years ago

For the evaluation based on Chamfer distance (Table 2 in the main text), it's 2000 for all three categories; this is for sure. For the comparison with 3D-PRNN (Table 4 in the supplementary), which is evaluated with 1-IoU, I honestly don't remember clearly, but the paper says 800 for Lamp, so I think it's 800.
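
For anyone reproducing that comparison, here is a minimal sketch of a plain voxel-grid IoU (from which 1 − IoU follows directly); it is written only for illustration, not taken from the paper's evaluation code:

```python
import numpy as np

def voxel_iou(a, b):
    # a, b: binary occupancy grids of the same shape, e.g. (64, 64, 64)
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both grids empty: treat as a perfect match
    return np.logical_and(a, b).sum() / union
```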

duzhenjiang113 commented 3 years ago

> (quoted from yolu1055's original post above)

Hi, could you please tell me how to sample the point cloud from a mesh and how to convert a voxel grid to a mesh? Thanks again!

ChrisWu1997 commented 3 years ago

@duzhenjiang113 You can use trimesh.sample.sample_surface to sample point clouds from a mesh, and marching cubes to get a mesh from a voxel grid.

duzhenjiang113 commented 3 years ago

> @duzhenjiang113 You can use trimesh.sample.sample_surface to sample point clouds from a mesh, and marching cubes to get a mesh from a voxel grid.

thanks

QifHE commented 3 years ago

Hi Wu,

Thank you for your great work! However, I encountered some problems while trying to calculate these metrics as described in your paper.

These are my results. The COV is multiplied by 100, and the MMD is multiplied by 1000. As you can see, there is something seriously wrong with my JSD.

| Category | Method | COV | MMD | JSD |
|---|---|---|---|---|
| Lamp | paper results | 87.95 | 10.01 | 0.0021 |
|  | my results | 88.44 | 9.217 | 0.0156 |

This is what I did:

  1. I followed your readme.md and yolu1055's steps in this issue to generate 2000 voxel samples ("/PQ-NET/proj_log/pqnet-PartNet-Lamp/results/fake_z_ckpt80000_num2000-mesh-p1/*.h5"), and generated mesh shapes using marching cubes as you mentioned.
  2. Then I used trimesh.sample.sample_surface to sample 2000 points from each mesh.
  3. I split the test dataset using "/PQ-NET/data/train_val_test_split/*.test.json" from the dataset you provided, which gave me 255 Lamps and 1217 Chairs.
  4. I extracted the "shapevoxel64" voxel shapes from the h5 files in the test dataset and applied the same process as in steps 1-2, using them as the ground-truth shapes (a small read sketch follows this list).
  5. I normalized both the ground-truth and generated point clouds using the function you provided above: https://github.com/ChrisWu1997/PQ-NET/issues/19#issuecomment-805902287
  6. I transferred the processed data to latent_3d_points and built my environment with Python 2.7, TensorFlow 1.10.0, and CUDA 9.0.
  7. Finally, I ran the latent_3d_points code for metric computation and got the results above.
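
For reference, a minimal sketch of reading the test-set voxels out of the h5 files for step 4 (the dataset key is copied from the description above and may need adjusting; the voxel-to-point-cloud conversion then follows the earlier snippets):

```python
import glob

import h5py
import numpy as np

def load_test_voxels(data_dir, key="shapevoxel64"):
    # key name as written in step 4 above; adjust it if your h5 files use a different one
    voxels = []
    for path in sorted(glob.glob(f"{data_dir}/*.h5")):
        with h5py.File(path, "r") as f:
            voxels.append(np.asarray(f[key]) > 0.5)  # binarize the 64^3 grid
    return voxels

# each entry can then go through the voxel -> mesh -> point cloud steps shown earlier
```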

I wonder if you could provide any insight into my problem? Thanks!


Sorry, I actually just found out that I misread the JSD number in your paper, and my results are in fact very close to yours. So there is no problem.

ChrisWu1997 commented 3 years ago

Hi @Mistral-Twirl,

Your workflow sounds correct. Did you use our pretrained model? Also, the paper used point clouds sampled from the PartNet meshes as the reference (your step 4), but I guess sampling from the ground-truth voxels won't make much difference.

Anyway, I used the pretrained model and evaluated the results myself. This time I also sampled the test point clouds from voxels, and got 225 Lamps and 1217 Chairs. The results are shown below, on par with the paper (slightly better).

| Category | COV | MMD | JSD |
|---|---|---|---|
| Chair | 58.91 | 8.24 | 0.0051 |
| Lamp | 87.55 | 8.90 | 0.0162 |

I pushed the evaluation code here; you can give it a try. For example:

# convert test set voxel to point clouds
python vox2pc.py --src ../data/Lamp --category Lamp --test_data
# convert generated voxel to point clouds
python vox2pc.py --src ../proj_log/pqnet-PartNet-Lamp/results/fake_z_ckpt80000_num2000-voxel-p0
# compute COV, MMD, JSD
python cov_mmd_partnet.py -g 0 --class_name Lamp --src ../proj_log/pqnet-PartNet-Lamp/results/fake_z_ckpt80000_num2000-voxel-p0 --test_pc ../data/Lamp_pc
QifHE commented 3 years ago

> (quoted from ChrisWu1997's reply above)

Hi Wu,

Thank you for your reply! I just hid my comment because it turned out that I had misread your JSD score of 0.021 as 0.0021, so the value I was expecting was roughly 10 times smaller than the actual one. My experimental results are in fact very similar to yours, so there is no problem. Sorry for the trouble.