snotskie / EpistemicNetworkAnalysis.jl

Native implementation of Epistemic Network Analysis written in the Julia programming language. Based on rENA 0.2.0.1.
https://snotskie.github.io/EpistemicNetworkAnalysis.jl/
GNU General Public License v3.0

For non-SVD rotations, total variance explained exceeds 1.0 and SVD axes are not in correct order #56

Closed. snotskie closed this issue 7 months ago.

snotskie commented 7 months ago

For example,

[screenshot]

And adding this as a test to runtests gives:

[screenshot of test output]

snotskie commented 7 months ago

Update: it appears that this only happens when the rotation has multiple axes before the SVD, like a double means rotation, but not with a single means rotation.

snotskie commented 7 months ago

For the non-orthogonality problem, the cause was this:

for j in (i+1):numExistingDims
    vj = Vector(model.embedding[j, edgeIDs])
    s = sqrt(sum(vj .^ 2))
    if s != 0
        model.embedding[j, edgeIDs] .= vj / s
    end

    # NOTE: vj still holds the values read at the top of the loop; the
    # normalization above only wrote to model.embedding, not to vj
    scalar = dot(vj, vi) / denom
    model.embedding[j, edgeIDs] .= vj - scalar * vi
    s = sqrt(sum(vj .^ 2))
    if s < 0.05
        model.embedding[j, edgeIDs] .= 0
        @warn "During the rotation step, axis $j was deflated to zero due to close correlation with axis $i."
    elseif s != 0
        model.embedding[j, edgeIDs] .= vj / s
    end
end

Every time model.embedding[j, edgeIDs] was updated, the shorthand copy vj was not updated with it, so the math based on vj kept using only the original values of model.embedding[j, edgeIDs].
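
For anyone reading along, the underlying behavior can be shown on a plain matrix (just an illustration, not package code): Vector(...) and row indexing make a copy, so later in-place writes to the source array are not reflected in that copy until it is re-read.

A = [3.0 4.0; 1.0 2.0]
vj = Vector(A[1, :])      # copy of the first row: [3.0, 4.0]
s = sqrt(sum(vj .^ 2))    # 5.0
A[1, :] .= vj / s         # the row stored in A is now normalized...
vj                        # ...but vj is still [3.0, 4.0]
vj = Vector(A[1, :])      # re-reading gives the updated [0.6, 0.8]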

The fix:

for j in (i+1):numExistingDims
    vj = Vector(model.embedding[j, edgeIDs])
    s = sqrt(sum(vj .^ 2))
    if s != 0
        model.embedding[j, edgeIDs] .= vj / s
        vj = Vector(model.embedding[j, edgeIDs]) # re-read so vj matches the normalized row
    end

    scalar = dot(vj, vi) / denom
    model.embedding[j, edgeIDs] .= vj - scalar * vi
    vj = Vector(model.embedding[j, edgeIDs]) # re-read after projecting out axis i
    s = sqrt(sum(vj .^ 2))
    if s < 0.05
        model.embedding[j, edgeIDs] .= 0
        vj = Vector(model.embedding[j, edgeIDs]) # re-read the zeroed row
        @warn "During the rotation step, axis $j was deflated to zero due to close correlation with axis $i."
    elseif s != 0
        model.embedding[j, edgeIDs] .= vj / s
        vj = Vector(model.embedding[j, edgeIDs]) # re-read the re-normalized row
    end
end
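
An equivalent way to avoid the stale-copy problem would be to do all of the arithmetic on the local copy and write it back to model.embedding once per iteration. This is only a sketch assuming the same surrounding definitions (vi, denom, numExistingDims, edgeIDs) as above, not what the staged fix actually does:

for j in (i+1):numExistingDims
    vj = Vector(model.embedding[j, edgeIDs])
    s = sqrt(sum(vj .^ 2))
    if s != 0
        vj = vj / s                    # normalize the working copy
    end

    scalar = dot(vj, vi) / denom
    vj = vj - scalar * vi              # project out axis i
    s = sqrt(sum(vj .^ 2))
    if s < 0.05
        vj = zero(vj)                  # deflate near-parallel axes to zero
        @warn "During the rotation step, axis $j was deflated to zero due to close correlation with axis $i."
    elseif s != 0
        vj = vj / s                    # re-normalize after deflation
    end

    model.embedding[j, edgeIDs] .= vj  # single write-back per iteration
end
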
snotskie commented 7 months ago

Partial fix staged in https://github.com/snotskie/EpistemicNetworkAnalysis.jl/commit/b0ee81e752356162dbc2615573e3cb4c0aa9a450

For ENAModel, tests confirm that the axes have the correct variance, are orthogonal, and are in the correct order.

But for BiplotENAModel{FormulaRotation}, the variance explained is still too high.

Scratching my head about how that could happen.
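
For reference, the invariants those tests check can be written down generically. The snippet below is a hypothetical, self-contained illustration on synthetic data, not the package's test code:

using LinearAlgebra, Statistics, Test

X = randn(100, 6)                  # stand-in for centered unit-model points
X = X .- mean(X, dims=1)           # center the columns
P = svd(X).V'                      # one axis per row, orthonormal by construction

varexp = [var(X * P[k, :]) for k in 1:size(P, 1)] ./ sum(var(X, dims=1))

@test P * P' ≈ I                   # axes are mutually orthogonal and unit length
@test sum(varexp) <= 1 + 1e-8      # total variance explained never exceeds 1.0
@test issorted(varexp, rev=true)   # axes are ordered by variance explained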

snotskie commented 7 months ago

Some findings. I have a Biplot model with 6 codes. Sometimes, everything is correct:

[screenshot]

Sometimes, not:

[screenshot]

What appears to be happening is that a surprise 7th dimension is being added, which should never happen with 6 codes in a biplot model. Chopping the excess dimensions off should, I believe, fix the problem, though I am not sure about the root cause. My best guess is that it has to do with precision errors.
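
A sketch of the chopping idea, with hypothetical names (maxDims standing in for whatever the real upper bound is, e.g. the number of codes for a biplot model, and assuming model.embedding can be reassigned); this is not necessarily what the staged commit does:

if size(model.embedding, 1) > maxDims
    # drop any surplus rows so the embedding never has more axes than the model can support
    model.embedding = model.embedding[1:maxDims, :]
end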

snotskie commented 7 months ago

Candidate solution staged in https://github.com/snotskie/EpistemicNetworkAnalysis.jl/commit/6a2ce1d550c0f8a24ee95254584c525b4c2f17e6