Earth Mover's Distance (EMD) loss

cihanongun commented 4 years ago

🚀 Feature Request

Earth Mover's Distance (EMD) is a popular loss metric for comparing point clouds alongside Chamfer Distance. Popular studies [1,2] use both distances for point cloud generation and reconstruction. Also, it is indicated by these studies that "the CD distance is less faithful than EMD to the visual quality of synthetic results." I believe EMD loss would be a valuable contribution to the library. Some implementations: https://github.com/Colin97/MSN-Point-Cloud-Completion/tree/master/emd https://github.com/daerduoCarey/PyTorchEMD

[1] https://arxiv.org/abs/1612.00603 [2] https://arxiv.org/abs/1707.02392

gkioxari commented 4 years ago

Hi @congun You are right. EMD is a useful 3D metric. We have looked into EMD and existing implementations in great depth. There is currently a lot of issues with existing EMD implementations, such as the once you cite, which I summarize below.

Existing EMD losses don't guarantee correctness. The most popular EMD implementation, used also in the second github link you provide, has no correctness guarantees and has no correlation with the relaxation algorithm from Bertsekas. I have brought this issue up https://github.com/fxia22/pointGAN/issues/5 and https://github.com/NVIDIAGameWorks/kaolin/issues/149. I do not encourage you to use this implementation as it is not correct and will lead to an avalanche of wrongly reported numbers and comparisons in the 3D community. Wrong benchmarks are worse than no benchmarks.
Iterative approaches are very approximate. There is a family of implementations, such as the first github link you provide, which are iterative and very approximate. These have no guarantee of convergence and different hyperparameters, which are needed when defining these implementations, can change the solution drastically. In my view, these attributes are very problematic and can lead to more noise in the 3D community.
Exact solutions don't scale well with size. The biggest challenge in implementing a correct EMD is the fact that a naive solution will not scale well with the size of points, both for time and memory. There is some recent technical papers from parallel computing conferences which use efficient cuda kernels for accurate assignment but are also limited to max 8k points.

We do not want to provide an EMD implementation which is not correct, well tested, robust, and efficient. The above solutions you cite do not meet our quality bar, unfortunately. It seems that good implementation for EMD that works for medium sized pointclouds (5k points and above) will require a lot of work, potentially groundbreaking and publication worthy!

cihanongun commented 4 years ago

Thank you very much for detailed explanation.

aluo-x commented 4 years ago

There is this: geomloss which gives differentiable optimal transport metrics for point clouds, which you can differentiably sample from meshes.

I think it is still too slow for training at reasonable point cloud sizes (~10K), but it may be sufficient for evaluation.

BoodgionWood commented 2 years ago

Hi @gkioxari I am wondering if there is any update on EMD loss for point cloud?

You mentioned that "The biggest challenge in implementing a correct EMD is the fact that a naive solution will not scale well with the size of points". Is there any implementation of naive solution that you think work well with point cloud of small size, e.g., 1k-2k points?

Thank you so much. I would like to use that as part of my newly developed point cloud GAN algorithm. The algorithm doesn't need to be tested on point cloud of very large size at the first step. So an implementation of naive method could be of great help. Thanks!

OswaldoBornemann commented 1 year ago

The same questions.

facebookresearch / pytorch3d

Earth Mover's Distance (EMD) loss #211

🚀 Feature Request