JuliaStats / NMF.jl

A Julia package for non-negative matrix factorization
Other
90 stars 34 forks source link

nndsvd: allow initialization with externally-supplied factorization #70

Closed timholy closed 1 year ago

timholy commented 1 year ago

rsvd is fast, but it's also noisy (i.e., not deterministic) and of lower quality than exact SVD. Particularly if you have already computed the SVD for other reasons, it's nice to be able to use it in initializing via nndsvd.

This starts with the abandoned #68 and brings it over the finish line. CC @youdongguo.

This will not pass tests on its own, but if you merge #69 first, close this, and then reopen, it should pass.

There are quite a few trailing-whitespace changes; you can change the settings (the "gear" icon under Files changed) to omit those for easier review.

codecov-commenter commented 1 year ago

Codecov Report

Base: 89.70% // Head: 92.37% // Increases project coverage by +2.66% :tada:

Coverage data is based on head (167ceb6) compared to base (8629a5d). Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #70 +/- ## ========================================== + Coverage 89.70% 92.37% +2.66% ========================================== Files 10 11 +1 Lines 408 708 +300 ========================================== + Hits 366 654 +288 - Misses 42 54 +12 ``` | [Impacted Files](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | Coverage Δ | | |---|---|---| | [src/initialization.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL2luaXRpYWxpemF0aW9uLmps) | `100.00% <100.00%> (ø)` | | | [src/interf.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL2ludGVyZi5qbA==) | `92.98% <100.00%> (+1.31%)` | :arrow_up: | | [src/multupd.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL211bHR1cGQuamw=) | `96.59% <0.00%> (-3.41%)` | :arrow_down: | | [src/utils.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL3V0aWxzLmps) | `84.09% <0.00%> (-1.10%)` | :arrow_down: | | [src/projals.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL3Byb2phbHMuamw=) | `100.00% <0.00%> (ø)` | | | [src/greedycd.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL2dyZWVkeWNkLmps) | `100.00% <0.00%> (ø)` | | | [src/NMF.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL05NRi5qbA==) | `100.00% <0.00%> (ø)` | | | [src/coorddesc.jl](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats#diff-c3JjL2Nvb3JkZGVzYy5qbA==) | `100.00% <0.00%> (+3.03%)` | :arrow_up: | | ... and [3 more](https://codecov.io/gh/JuliaStats/NMF.jl/pull/70?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats) | | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=JuliaStats)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

ararslan commented 1 year ago

Any idea what's going on with this test (added in this PR) only on Linux with 1.8?

Test Failed at /home/runner/work/NMF.jl/NMF.jl/test/initialization.jl:48
  Expression: norm(X - W2 * H2) < norm(X - W1 * H1)
   Evaluated: 2.0457485837143 < 2.039189688675492
timholy commented 1 year ago

Nope. I can't replicate the failure locally. Let's try again and see if it repeats?

timholy commented 1 year ago

This time, they all passed.

What's interesting is that the random seed isn't set before construction of X for Float64; it gets set during the Float64 tests and then makes the X for Float32 deterministic, but Float64 is not deterministic. I think before merging this it might make sense to standardize all this. Since we set the seed in @testsets, I think it's high time to transition the tests to @testsets. I'll do that first before this merges (EDIT: I'll need to fix a conflict once #72 merges).

@ararslan, are you otherwise OK with it?

ararslan commented 1 year ago

@ararslan, are you otherwise OK with it?

Sure am!

timholy commented 1 year ago

Weird. There are non-reproducible failures. Did we just get lucky with #72?