Closed by davibarreira 3 years ago
| Totals | |
|---|---|
| Change from base Build 911721124: | 0.1% |
| Covered Lines: | 462 |
| Relevant Lines: | 488 |
Merging #95 (d5d57c7) into master (24ecf5d) will decrease coverage by 0.21%. The diff coverage is 100.00%.
```diff
@@            Coverage Diff             @@
##           master      #95      +/-   ##
==========================================
- Coverage   94.88%   94.67%   -0.22%
==========================================
  Files           6        7       +1
  Lines         430      488      +58
==========================================
+ Hits          408      462      +54
- Misses         22       26       +4
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/utils.jl | 100.00% <100.00%> (ø) | |
| src/entropic.jl | | |
| src/entropic/sinkhorn.jl | 97.98% <0.00%> (ø) | |
| src/entropic/sinkhorn_stabilized.jl | 94.33% <0.00%> (ø) | |
Continue to review full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24ecf5d...d5d57c7. Read the comment docs.
@devmotion, I've implemented the changes requested, but I was not sure how to implement the `RowVecs` and `ColVecs` standard. I still have to take a better look at how KernelFunctions.jl does it. Is your idea for us to implement it directly in our package, or to use the functions from KernelFunctions.jl?
I've removed the normalization and the default assignment of equal probability to all observations when no probability vector is provided, but I still think they are good features to have. @zsteve, what do you think?
> @devmotion, I've implemented the changes requested, but I was not sure how to implement the `RowVecs` and `ColVecs` standard. I still have to take a better look at how KernelFunctions.jl does it. Is your idea for us to implement it directly in our package, or to use the functions from KernelFunctions.jl?
No, this should not be reimplemented. Users should e.g. use ArraysOfArrays.jl or KernelFunctions.jl. And hopefully soon ColVecs and RowVecs will be moved to their own package, we discussed this a while ago and IIRC there is an open issue.
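To illustrate the suggestion, here is a small sketch (assuming KernelFunctions.jl is available) of how users could wrap their own data with `ColVecs`/`RowVecs` instead of this package reimplementing them:

```julia
# Sketch: wrap existing matrices as vectors of observations without copying.
using KernelFunctions: ColVecs, RowVecs

X = rand(3, 10)        # 3-dimensional data, one observation per column
xs = ColVecs(X)        # behaves as a length-10 vector of 3-element vectors
@assert length(xs) == 10
@assert xs[1] == X[:, 1]

Y = rand(10, 3)        # one observation per row
ys = RowVecs(Y)
@assert ys[1] == Y[1, :]
```

Any function accepting an `AbstractVector` of observations would then work with either storage layout.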
> I've removed the normalization and the default assignment of equal probability to all observations when no probability vector is provided, but I still think they are good features to have. @zsteve, what do you think?
I think the normalization is not needed for many algorithms and can easily be performed manually if needed (e.g. with `LinearAlgebra.normalize!`), therefore I would not include it in this PR. Additionally, the initial implementation in the PR caused a type instability which can be avoided in this way.
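For reference, the manual normalization mentioned here is a one-liner with the standard library:

```julia
using LinearAlgebra

p = [1.0, 2.0, 3.0, 4.0]
normalize!(p, 1)      # divide in place by the 1-norm so the weights sum to one
@assert sum(p) ≈ 1.0
@assert p ≈ [0.1, 0.2, 0.3, 0.4]
```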
I am fine with adding a default for the probabilities; you could just change `probs::AbstractVector{<:Real}` to `probs::AbstractVector{<:Real}=Fill(inv(length(support)), length(support))`.
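A standalone sketch of that suggested default (here `discretemeasure` is a hypothetical stand-in name, not the actual constructor from the PR):

```julia
using FillArrays

# Hypothetical constructor: default to uniform probabilities using a lazy
# `Fill`, which is type-stable and avoids allocating a dense vector.
function discretemeasure(support::AbstractVector,
                         probs::AbstractVector{<:Real}=Fill(inv(length(support)),
                                                            length(support)))
    return (support = support, p = probs)
end

μ = discretemeasure(randn(4))
@assert sum(μ.p) ≈ 1.0   # μ.p is Fill(0.25, 4)
```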
Cool, I'll do the modifications right now. I'd like to get this PR through asap, so I can go back to the other implementations :)
There is some compatibility issue, but I cannot reproduce it on my machine. It seems to be related to KernelFunctions, which I used in the tests.
KernelFunctions requires Julia >= 1.3 (this is one reason for why we want to extract ColVecs and RowVecs to a separate package). It seems ArraysOfArrays also supports Julia 1.0, so it would be better to use it in the tests until ColVecs and RowVecs are compatible with Julia 1.0.
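A sketch of the ArraysOfArrays-based alternative for the tests (assuming its exported `nestedview`, which views the last dimension of a matrix as the observation index):

```julia
using ArraysOfArrays

X = rand(3, 10)
xs = nestedview(X)       # length-10 vector of column views, no copying
@assert length(xs) == 10
@assert xs[1] == X[:, 1]
```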
If there are no more proposed changes, I think this is ready to be merged :D
> KernelFunctions requires Julia >= 1.3 (this is one reason for why we want to extract ColVecs and RowVecs to a separate package). It seems ArraysOfArrays also supports Julia 1.0, so it would be better to use it in the tests until ColVecs and RowVecs are compatible with Julia 1.0.
@devmotion would you know if the ColVecs and RowVecs have been moved to another package? I'm writing some extra documentation on the 1D case, and I remembered this.
This PR has the implementation of the `FiniteDiscreteMeasure` struct which I talked about in #92. It is similar to the `DiscreteNonParametric` type in Distributions.jl. It contains a `support` and a `p` (for probability), allowing the user to store empirical distributions. This would allow us to write functions such as `sinkhorn2(c, mu::FiniteDiscreteMeasure, nu::FiniteDiscreteMeasure)` without having to pass the support and the probabilities separately.

P.S.: I don't feel strongly about the `FiniteDiscreteMeasure` naming, so feel free to give better suggestions. I just don't think `DiscreteMeasure` is a good name, because our variable cannot have infinite support, like a Poisson distribution. `SimpleMeasure` is another option and would be theoretically correct. `EmpiricalMeasure` and `SampleMeasure` are also options, but they are somewhat ambiguous, since the measure does not necessarily come from an empirical or sampling process.
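For readers of the thread, a minimal sketch of the struct described above (field names taken from the discussion; the actual definition in the PR may differ in type parameters and constructor checks):

```julia
# Minimal sketch: a finitely supported measure holding a support and weights.
struct FiniteDiscreteMeasure{S<:AbstractVector,P<:AbstractVector{<:Real}}
    support::S
    p::P
end

μ = FiniteDiscreteMeasure([1.0, 2.0, 3.0], [0.2, 0.3, 0.5])
@assert length(μ.support) == length(μ.p)
@assert sum(μ.p) ≈ 1.0
```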