johnmyleswhite / kNN.jl

The k-nearest neighbors algorithm in Julia

knn classifier accepts various data structures #13

Open wildart opened 10 years ago

wildart commented 10 years ago

The classifier can accept data structures from the NearestNeighbors package.

wildart commented 10 years ago

I propose two new abstract types, Classifier and Regressor, to distinguish the majority-vote and averaging approaches to the results of a particular prediction.

abstract Classifier
immutable kNNClassifier{T <: NearestNeighborTree} <: Classifier
    t::T       # fitted nearest-neighbor search structure
    y::Vector  # training labels, one per point in the tree
end

Classifier should be an interface that defines a method specification for handling the results of a particular model's prediction, either through majority vote or as-is. The same approach would apply to Regressor, with averaging.
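
A rough sketch of how the two interfaces could reduce neighbor results (hypothetical helper names, Julia 0.3-era syntax to match the code above; this is not part of the package):

```julia
abstract Classifier
abstract Regressor

# Classification: majority vote over the k neighbor labels.
# Labels can be of arbitrary type, so we count with a Dict.
function majority_vote(labels::Vector)
    counts = Dict{Any, Int}()
    for l in labels
        counts[l] = get(counts, l, 0) + 1
    end
    best, best_n = labels[1], 0
    for (l, n) in counts
        if n > best_n
            best, best_n = l, n
        end
    end
    return best
end

# Regression: average the k neighbor target values.
average(values::Vector) = sum(values) / length(values)
```

Each concrete model would then only need to supply the neighbor lookup; the abstract type determines which reduction is applied.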

johnmyleswhite commented 10 years ago

The community's spent a lot of time discussing how to define things like Regressor in the past: https://github.com/JuliaStats/Roadmap.jl/issues/4

I'm not sure it's the most fruitful thing to do at the moment.

johnmyleswhite commented 10 years ago

Coming back to this, perhaps we should introduce even finer type distinctions? One might want to use majority voting with a parametric type that stores the predictions in a vector of the same type as the input labels (which could be arbitrary types). In contrast, the averaging/probability case seems to call for a very different data structure as output.
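
One way this finer distinction might look, parameterizing on a label type L so hard predictions share the input's element type while the probability case gets its own output structure (hypothetical names, not the package's actual API):

```julia
# Parameterize the classifier on the label type L.
immutable kNNClassifier{T <: NearestNeighborTree, L} <: Classifier
    t::T
    y::Vector{L}             # training labels of arbitrary type L
end

# Hard prediction would return a Vector{L}, matching the training labels:
#   predict(m::kNNClassifier{T, L}, X) -> Vector{L}

# The averaging/probability case calls for a different structure,
# e.g. distinct classes plus a numeric weight matrix:
immutable ClassProbabilities{L}
    classes::Vector{L}       # distinct labels, one per column
    probs::Matrix{Float64}   # n_samples x n_classes, rows sum to 1
end
```

The point is that the element type of the hard-prediction output is tied to the labels, while the soft output is always numeric regardless of L.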

wildart commented 10 years ago

I agree with a parametric type for labels in the classification case. As for the regression case, the output would always be a numerical value, even though implementations could output auxiliary data.
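
For the regression case that might reduce to something like the following sketch, where the prediction itself is always numeric and any auxiliary data (e.g. neighbor distances) would travel in a separate field or return value (again hypothetical, matching the era's syntax):

```julia
immutable kNNRegression{T <: NearestNeighborTree} <: Regressor
    t::T
    y::Vector{Float64}   # numeric targets only
end

# Hypothetical signature: averaging always yields Float64 predictions,
# with auxiliary data returned alongside rather than replacing them.
#   predict(m::kNNRegression, X; k = 5) -> Vector{Float64}
```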