lmcinnes / pynndescent

A Python nearest neighbor descent for approximate nearest neighbors
BSD 2-Clause "Simplified" License

Error with Wasserstein distance #104

Open anlarro opened 3 years ago

anlarro commented 3 years ago

When I try to use the "wasserstein" distance I get the error: "Optimal transport problem was INFEASIBLE. Please check inputs." What exactly should I check? What form should the inputs take? With other distances such as "euclidean" or "manhattan" I don't get any errors.

sgbaird commented 3 years ago

It might help if you posted a reproducible example (and verified that the latest version still has this issue).

lmcinnes commented 3 years ago

The most likely cause would be an all-zeros vector. A problem is infeasible if there is no way to transport one probability distribution to the other. If one of the items can't be expressed as a distribution, either because it is all zeros (so normalization produces NaNs) or because it has negative values, that would be one way to be infeasible. Another possibility is that the cost matrix has NaNs or infs in it, which makes certain transportations impossible.
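
For anyone hitting the same error, here is a rough sketch of how the row-level conditions described above could be checked before building the index. It assumes the data is a dense 2-D NumPy array with one item per row; the helper name `find_infeasible_rows` is hypothetical and not part of pynndescent.

```python
import numpy as np

def find_infeasible_rows(X):
    """Flag rows that cannot be normalized into a probability distribution.

    Hypothetical helper for illustration; not part of pynndescent.
    """
    X = np.asarray(X, dtype=float)
    # Rows containing NaN or +/-inf
    non_finite = ~np.isfinite(X).all(axis=1)
    # Rows containing at least one negative entry
    negative = (X < 0).any(axis=1)
    # Rows where every entry is exactly zero (normalizing divides by zero)
    all_zero = ~(X != 0).any(axis=1)
    return {
        "non_finite": np.flatnonzero(non_finite),
        "negative": np.flatnonzero(negative),
        "all_zero": np.flatnonzero(all_zero),
    }

# Small made-up example: row 1 is all zeros, row 2 has a negative
# value, and row 3 contains a NaN.
X = np.array([
    [0.2, 0.8, 0.0],
    [0.0, 0.0, 0.0],
    [0.5, -0.1, 0.6],
    [np.nan, 1.0, 0.0],
])
problems = find_infeasible_rows(X)
for name, rows in problems.items():
    print(name, rows.tolist())
```

Any row reported here would need to be dropped or repaired before using the "wasserstein" metric. If a custom cost matrix is supplied, a similar `np.isfinite(cost).all()` check on that array would cover the NaN/inf case.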