rofinn closed 3 years ago
@appleparan This may be relevant to you ☝️
All review comments applied except for @view/@views, since there wasn't a clear performance benefit during quick benchmarking. I'll merge when tests pass.
That's good! Thanks!
ismissing on imputed data, so it would never identify missing neighbors. Example: https://codecov.io/gh/invenia/Impute.jl/src/2f64a27010480692aff4077792e92ce5b5c01bc0/src/imputors/knn.jl
weights from StatsBase.
KDTree from full dataset, but reduces the points searched for to those observations containing missings.
The end result is that the iris dataset tests aren't much better, but the new code allocates less memory, is faster, and performs an order of magnitude better on the "Data match" tests that we expect it to do well on.
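For anyone following along, here is a rough, dependency-free sketch of the idea described above. It is not the code in this PR: `knn_impute_sketch` is a hypothetical name, a brute-force distance scan stands in for the KDTree query, a hand-written inverse-distance weighted mean stands in for StatsBase weights, and the variables × observations layout and the mean pre-fill are assumptions. The point it tries to show is that missingness is recorded from the original data (not the filled copy) and that only observations containing missings are searched for.

```julia
using Statistics

# Hypothetical helper for illustration only; not Impute.jl's API.
# `data` is assumed to be variables × observations (one column per observation),
# with at least one observed value per variable and k < number of observations.
function knn_impute_sketch(data::AbstractMatrix; k::Int=2)
    # Record missingness from the ORIGINAL data, before anything is filled in.
    mmask = ismissing.(data)

    # Mean-fill a copy so that distances are defined for every observation.
    filled = Matrix{Float64}(undef, size(data))
    for i in axes(data, 1)
        rowmean = mean(skipmissing(view(data, i, :)))
        for j in axes(data, 2)
            filled[i, j] = mmask[i, j] ? rowmean : data[i, j]
        end
    end

    result = copy(filled)
    # Only observations that originally contained missings need a neighbor search.
    for j in findall(vec(any(mmask; dims=1)))
        # Brute-force distances to every observation (stand-in for a KDTree query).
        dists = [sqrt(sum(abs2, filled[:, j] .- filled[:, n])) for n in axes(filled, 2)]
        dists[j] = Inf  # never pick the observation itself
        neighbors = partialsortperm(dists, 1:k)
        for i in findall(mmask[:, j])
            # Drop neighbors whose value for this variable was itself missing.
            valid = [n for n in neighbors if !mmask[i, n]]
            isempty(valid) && continue  # keep the mean fill as a fallback
            w = 1 ./ (dists[valid] .+ eps())  # inverse-distance weights
            result[i, j] = sum(w .* filled[i, valid]) / sum(w)
        end
    end
    return result
end

data = Union{Missing, Float64}[1.0 1.1 5.0 1.2; 2.0 missing 9.0 2.2]
imputed = knn_impute_sketch(data; k=2)
```

In this toy input the second observation is missing its second variable, and its two nearest neighbors are the first and fourth observations, so the imputed value lands between 2.0 and 2.2 rather than being pulled toward the outlying third observation.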
Before:
After:
Before:
After: