PythonOT / POT

POT : Python Optimal Transport
https://PythonOT.github.io/
MIT License

Questions for ot.barycenter #633

Closed: peteryang1031 closed this issue 2 months ago

peteryang1031 commented 3 months ago

I have a few questions regarding the use of the ot.barycenter function.

1. Cost Matrix with More Than Two Distributions:

Suppose I have more than two distributions. How can I compute the cost matrix for calculating the barycenter? I am particularly confused about the computation of matrix M when the number of distributions exceeds two.

2. Barycenter with Different Number of Points:

Assume I have two discrete distributions: A with 300 points (each with 30 dimensions) and B with 200 points (each with 30 dimensions). How can I compute the barycenter between these two distributions with a different number of points?

I am gradually learning to use POT and intend to implement it in my project. I appreciate any guidance you can provide on these questions.

cedricvincentcuaz commented 2 months ago

Hello @peteryang1031, computing the barycenter of several distributions in R^d requires an explicit cost function for comparing two points in R^d. So generic cost matrices M (pairwise relationships between points across distributions), as in ot.emd, are not supported in POT. You can find an implementation with a Euclidean cost here, for example: https://pythonot.github.io/auto_examples/barycenters/plot_free_support_barycenter.html#sphx-glr-auto-examples-barycenters-plot-free-support-barycenter-py

There are also different variants that assume or leverage specific properties of the domain, e.g. projections onto subspaces or points over grids, here: https://pythonot.github.io/auto_examples/index.html#wasserstein-barycenters

Best, Cédric