JuliaOptimalTransport / OptimalTransport.jl

Optimal transport algorithms for Julia
https://juliaoptimaltransport.github.io/OptimalTransport.jl/dev
MIT License
93 stars 8 forks source link

Finite Discrete Measure #95

Closed davibarreira closed 3 years ago

davibarreira commented 3 years ago

This PR has the implementation of the FiniteDiscreteMesure struct which I talked about in #92. It is similar to the DiscreteNonParametric type in Distributions.jl. It contains a supportand a p (for probability), allowing the user to store the empirical distributions. This would allow use to write functions such as sinkhorn2(c, mu::FiniteDiscreteMeasure, nu::FiniteDiscreteMeasure), without having to pass both support and probabilities separately.

p.s: I don't feel strongly about the FiniteDiscreteMeasure naming. Feel free to give better suggestions. I just don't think DiscreteMeasure is a good name, because our variable cannot have infinite support, like a Poisson distribution. SimpleMeasure is another option and would be theoretically correct. EmpiricalMeasureand SampleMeasure are another option, but they are kind of ambiguous, since the measure does not need necessarily to be either empirical or from a sampling process.

coveralls commented 3 years ago

Pull Request Test Coverage Report for Build 914791279


Totals Coverage Status
Change from base Build 911721124: 0.1%
Covered Lines: 462
Relevant Lines: 488

💛 - Coveralls
codecov-commenter commented 3 years ago

Codecov Report

Merging #95 (d5d57c7) into master (24ecf5d) will decrease coverage by 0.21%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #95      +/-   ##
==========================================
- Coverage   94.88%   94.67%   -0.22%     
==========================================
  Files           6        7       +1     
  Lines         430      488      +58     
==========================================
+ Hits          408      462      +54     
- Misses         22       26       +4     
Impacted Files Coverage Δ
src/utils.jl 100.00% <100.00%> (ø)
src/entropic.jl
src/entropic/sinkhorn.jl 97.98% <0.00%> (ø)
src/entropic/sinkhorn_stabilized.jl 94.33% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 24ecf5d...d5d57c7. Read the comment docs.

davibarreira commented 3 years ago

@devmotion , I've implemented the changes requested. But I was not sure on how to implement the RowVecsand ColVecsstandard. I still have to take a better look at how KernelFunctions.jl does it. Is your idea for us to implement it directly in our package, or to use the functions from KernelFunctions.jl?

I've removed the normalization and the default assignment of equal probability to all observations in case no prob vector is provided, but I still think they are good features to have. @zsteve , what do you think?

devmotion commented 3 years ago

@devmotion , I've implemented the changes requested. But I was not sure on how to implement the RowVecsand ColVecsstandard. I still have to take a better look at how KernelFunctions.jl does it. Is your idea for us to implement it directly in our package, or to use the functions from KernelFunctions.jl?

No, this should not be reimplemented. Users should e.g. use ArraysOfArrays.jl or KernelFunctions.jl. And hopefully soon ColVecs and RowVecs will be moved to their own package, we discussed this a while ago and IIRC there is an open issue.

I've removed the normalization and the default assignment of equal probability to all observations in case no prob vector is provided, but I still think they are good features to have. @zsteve , what do you think?

I think the normalization is not needed for many algorithms and can easily performed manually if needed (e.g. with LinearAlgebra.normalize!), therefore I would not include it in this PR. Additionally, the initial implementation in the PR caused a type instability which can be avoided in this way.

I am fine with adding a default for the probabilities, you could just change probs::AbstractVector{<:Real} to probs::AbstractVector{<:Real}=Fill(inv(length(support)), length(support)).

davibarreira commented 3 years ago

Cool, I'll do the modifications right now. I'd like to get this PR through asap, so I can go back to the other implementations :)

davibarreira commented 3 years ago

There is some compatibility issue, but I cannot reproduce in my machine. It seems to be related to KernelFunctions, which I used in the test.

devmotion commented 3 years ago

KernelFunctions requires Julia >= 1.3 (this is one reason for why we want to extract ColVecs and RowVecs to a separate package). It seems ArraysOfArrays also supports Julia 1.0, so it would be better to use it in the tests until ColVecs and RowVecs are compatible with Julia 1.0.

davibarreira commented 3 years ago

If there are no more proposed changes, I think this is ready to be merged :D

davibarreira commented 2 years ago

KernelFunctions requires Julia >= 1.3 (this is one reason for why we want to extract ColVecs and RowVecs to a separate package). It seems ArraysOfArrays also supports Julia 1.0, so it would be better to use it in the tests until ColVecs and RowVecs are compatible with Julia 1.0.

@devmotion would you know if the ColVecs and RowVecs have been moved to another package? I'm writing some extra documentation on the 1D case, and I remembered this.