madeleineudell / LowRankModels.jl

LowRankModels.jl is a Julia package for modeling and fitting generalized low rank models.

OrdinalDomain does not take real ordinal values in interval of length less than 2 #82

Open kmundnic opened 6 years ago

kmundnic commented 6 years ago

Hi, OrdinalDomain only accepts Ints as its min and max values, so it does not work when the ordinal values are floats. For example, the ordinal values [0.0, 0.1, 0.2, 0.3] trigger a misleading warning (domains.jl:40) and then an error.

The fix I have in mind is to define:

# Ordinal data takes real values ranging from `min` to `max`
struct OrdinalDomain <: Domain
    min::Real
    max::Real
    function OrdinalDomain(elements)
        # Count distinct levels, so the check matches the warning text
        if length(unique(elements)) <= 2
            @warn "The ordinal variable you've created is degenerate: it has at most two levels. Consider using a Boolean variable instead; ordinal loss functions may have unexpected behavior on a degenerate ordinal domain."
        end
        return new(minimum(elements), maximum(elements))
    end
end

Any comments/concerns on this fix?

madeleineudell commented 5 years ago

Right now, the min and max values are used to index into a vector; they have to be Ints! The simple fix is to transform your data, mapping your ordinal values to consecutive integers.
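
A minimal sketch of that data transformation, assuming the ordinal values of one column are held in a plain vector. The helper name `ordinal_codes` is hypothetical and not part of LowRankModels.jl; it only illustrates mapping each distinct value to its rank.

```julia
# Hypothetical helper (not part of LowRankModels.jl):
# map each distinct ordinal value to its rank 1..k among the sorted levels.
function ordinal_codes(values::AbstractVector{<:Real})
    levels = sort(unique(values))   # distinct levels in ascending order
    lookup = Dict(level => code for (code, level) in enumerate(levels))
    codes = [lookup[v] for v in values]   # consecutive integer codes
    return codes, levels
end

codes, levels = ordinal_codes([0.0, 0.1, 0.2, 0.3, 0.1])

# Indexing the levels by the codes recovers the original values.
original = levels[codes]
```

The codes span 1 through `length(levels)`, so they can serve as the integer min and max that the domain currently requires, and `levels` is kept around to translate results back.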

The more complex fix would be to do this internally inside the GLRM. You'd probably have to add a new field holding the original data, and define functions mapping each column from the original domain to the transformed domain and back. Let glrm.A be the transformed data and glrm.df be the original data. Methods like impute, impute_missing, sample, and sample_missing should first impute glrm.A, then transform the data back to the original coordinates using the defined map, and return glrm.df.
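
A rough sketch of what such a per-column map could look like, written in plain Julia without touching the GLRM type. `OrdinalMap`, `fit_map`, `to_code`, and `from_code` are hypothetical names chosen for illustration, and the vectors below merely stand in for one column of glrm.df and glrm.A.

```julia
# Hypothetical type: stores the sorted distinct levels of one ordinal column.
struct OrdinalMap{T<:Real}
    levels::Vector{T}
end

# Build the map from a column of original (possibly non-integer) values.
fit_map(column::AbstractVector{<:Real}) = OrdinalMap(sort(unique(column)))

# Original value -> integer code (its position among the sorted levels).
function to_code(m::OrdinalMap, x::Real)
    i = searchsortedfirst(m.levels, x)
    if i > length(m.levels) || m.levels[i] != x
        throw(ArgumentError("unknown ordinal level $x"))
    end
    return i
end

# Integer code -> original value.
from_code(m::OrdinalMap, code::Integer) = m.levels[code]

df_col = [0.0, 0.1, 0.2, 0.3, 0.1]              # stands in for a column of the original data
m = fit_map(df_col)
A_col = [to_code(m, x) for x in df_col]         # stands in for the transformed column
restored = [from_code(m, c) for c in A_col]     # mapping integer codes back to the original domain
```

In this sketch, an imputation routine would operate on the integer codes and call `from_code` on its output before returning, so callers only ever see values in the original coordinates.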