Closed: ahwillia closed this pull request 9 years ago.
I validated this against the initialization of NMF.jl. Note that you have to change the source code to export the `nndsvd` function from the module.

@madeleineudell -- would it be better to import `NMF.nndsvd` and use that within the `init_nnmf!` function?
Code reuse is always a good idea. Why don't we

```julia
import NMF.nndsvd
init_nndsvd! = NMF.nndsvd
```

in keeping with the fact that the other initializations have names referring to the initialization algorithm rather than the type of problem they're intended for?
Sounds good. What about scaling each column of the data matrix `A` by the `loss.scale` parameter? (See ahwillia@a193eee: https://github.com/ahwillia/LowRankModels.jl/commit/a193eee74a77688ddbeb2e9a0e917825b4421323#diff-830d44f680dffd2477e41c071bf3fd35R88)
We could do:
```julia
import NMF.nndsvd

function init_nndsvd!(glrm::GLRM; scaling::Bool=true,
                      zeroh::Bool=false, variant::Symbol=:std)
    m, n = size(glrm.A)

    # only initialize based on observed entries
    A_sparse = zeros(m, n)
    for i = 1:n
        A_sparse[glrm.observed_examples[i], i] = glrm.A[glrm.observed_examples[i], i]
    end

    # scale all columns by the Loss.scale parameter
    if scaling
        for i = 1:n
            A_sparse[:, i] ./= glrm.losses[i].scale
        end
    end

    W, H = nndsvd(A_sparse, glrm.k, variant=variant)
    glrm.X = W'
    glrm.Y = H
end
```
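For context, a hedged usage sketch (it assumes `glrm` is an already-constructed `GLRM` over nonnegative data; the keywords mirror the draft above, and `:ar` is assumed to be one of the variants `NMF.nndsvd` accepts):

```julia
# initialize glrm.X and glrm.Y from the observed, scaled entries of glrm.A
init_nndsvd!(glrm; scaling=true, variant=:ar)
```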
Scaling by `loss.scale` is a good idea, but:

1. Use `scale` as the keyword, to be consistent.
2. Multiply, rather than divide, by `scale`: e.g. `scale = 0` means we don't care at all about that column, whereas here we'd divide by 0, which means we care infinitely.

An even more interesting variant: iterate `nndsvd`. Take the `W` and `H` returned, replace the unobserved entries of `A_sparse` with the entries of `W*H`, and repeat until convergence or user timeout (max iters). This is a popular method for matrix completion ("soft impute") that makes the initial values chosen for unobserved entries matter less with each iteration; but I've never seen it used for NNMF...
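To illustrate the multiply-vs-divide point, a toy sketch (the column data is made up):

```julia
# a column we "don't care about at all": scale = 0
col = [1.0, 2.0, 3.0]
scale = 0.0

col .* scale    # [0.0, 0.0, 0.0] -- the column drops out, as intended
# col ./ scale  # [Inf, Inf, Inf] -- the column dominates, the opposite of intended
```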
```julia
import NMF.nndsvd

function init_nndsvd!(glrm::GLRM; scale::Bool=true, zeroh::Bool=false,
                      variant::Symbol=:std, max_iters::Int=0)
    m, n = size(glrm.A)

    # only initialize based on observed entries
    A_init = zeros(m, n)
    for i = 1:n
        A_init[glrm.observed_examples[i], i] = glrm.A[glrm.observed_examples[i], i]
    end

    # scale all columns by the Loss.scale parameter
    if scale
        for i = 1:n
            A_init[:, i] .*= glrm.losses[i].scale
        end
    end

    # run the nndsvd initialization
    W, H = nndsvd(A_init, glrm.k, zeroh=zeroh, variant=variant)

    # If max_iters > 0, do a soft impute for the missing entries of A.
    # Iterate: estimate A as the product W*H,
    #          then update the (W, H) nndsvd estimate based on the new A.
    for _ = 1:max_iters
        A_init = W * H
        W, H = nndsvd(A_init, glrm.k, zeroh=zeroh, variant=variant)
    end

    glrm.X = W'
    glrm.Y = H
end
```
Almost: but we want to keep the entries of `A_init` that we did observe fixed as we perform soft impute.

```julia
for _ = 1:max_iters
    A_init = W * H
    for i = 1:n
        A_init[glrm.observed_examples[i], i] = glrm.A[glrm.observed_examples[i], i]
    end
    W, H = nndsvd(A_init, glrm.k, zeroh=zeroh, variant=variant)
end
```

(This recopies all the observed entries on every pass; if many entries are observed, it's better to start with the previous `A_init` and just splice in the unobserved entries from `W*H`. But this is the right idea.)
Abbreviated version of the latest commit:

```julia
W, H = nndsvd(A_init, glrm.k, zeroh=zeroh, variant=variant)
glrm.X = W'
glrm.Y = H

# soft impute: refresh only the unobserved entries, keeping observed ones fixed
for iter = 1:max_iters
    for j = 1:n
        # rows of column j that were not observed
        for i = setdiff(1:m, glrm.observed_examples[j])
            A_init[i, j] = dot(glrm.X[:, i], glrm.Y[:, j])
        end
    end
    W, H = nndsvd(A_init, glrm.k, zeroh=zeroh, variant=variant)
    glrm.X = W'
    glrm.Y = H
end
```
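One design note on the snippet above: since `glrm.X = W'` and `glrm.Y = H`, the dot product recomputes single entries of `W*H` without forming the full product. A self-contained toy check of that identity (`using LinearAlgebra` is only needed on Julia 0.7 and later; in 2015, `dot` lived in Base):

```julia
using LinearAlgebra

W = rand(5, 2); H = rand(2, 4)
X, Y = W', H
i, j = 3, 2
# dot(X[:,i], Y[:,j]) picks out entry (i,j) of W*H
@assert isapprox(dot(X[:, i], Y[:, j]), (W * H)[i, j])
```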
Here is a quick implementation of `nndsvd` initialization. We should probably develop a few test cases to make sure this is working as expected before merging. But it seems to be behaving reasonably in my hands so far...
Note: Sorry for the duplicate pull request. I'll delete the other one.
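On the test-case point above, a minimal sanity check might look like the following sketch (it assumes `nndsvd` can be called as `NMF.nndsvd` and returns a `(W, H)` pair, as used throughout this thread):

```julia
using NMF

# an exactly rank-k nonnegative matrix
m, n, k = 50, 20, 3
A = rand(m, k) * rand(k, n)

W, H = NMF.nndsvd(A, k)

# NNDSVD factors should be nonnegative and correctly shaped
@assert all(W .>= 0) && all(H .>= 0)
@assert size(W) == (m, k) && size(H) == (k, n)
```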