VarIr / scikit-hubness

A Python package for hubness analysis and high-dimensional data mining
BSD 3-Clause "New" or "Revised" License

ENH sparse multilabel classification #61

Closed VarIr closed 4 years ago

VarIr commented 4 years ago

Sparse indicator target matrices in kNN are converted to dense arrays, which can cause out-of-memory errors when there are many classes, and is likely inefficient even for a moderate number of classes.

This PR makes use of the indicator matrix sparsity, and parallelizes critical loops.
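As a rough illustration of the idea (not the code in this PR), neighbor votes can be accumulated without ever densifying the indicator matrix. The sketch below assumes precomputed neighbor indices and a SciPy sparse target matrix; the function name `sparse_knn_multilabel_predict` and the `threshold` parameter are hypothetical and exist only for this example.

```python
import numpy as np
from scipy import sparse


def sparse_knn_multilabel_predict(neigh_ind, y_train, threshold=0.5):
    """Majority vote over k neighbors, keeping the targets sparse.

    neigh_ind : (n_query, k) integer array of neighbor indices
    y_train   : (n_train, n_classes) sparse 0/1 indicator matrix
    """
    neigh_ind = np.asarray(neigh_ind)
    n_query, k = neigh_ind.shape
    y_train = sparse.csr_matrix(y_train)
    n_train = y_train.shape[0]

    # Sparse membership matrix: entry (i, j) is 1 if j is a neighbor of query i.
    rows = np.repeat(np.arange(n_query), k)
    cols = neigh_ind.ravel()
    data = np.ones(rows.shape[0], dtype=np.float64)
    membership = sparse.csr_matrix((data, (rows, cols)), shape=(n_query, n_train))

    # Sparse-sparse product yields per-class vote counts; the result stays sparse.
    votes = (membership @ y_train).tocsr()

    # Threshold only the stored entries; classes with zero votes are never touched.
    votes.data = (votes.data / k > threshold).astype(np.int8)
    votes.eliminate_zeros()
    return votes
```

Memory use then scales with the number of non-zero labels among the neighbors rather than with `n_query * n_classes`.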

This also enables parallel ANN search when no value is passed via the algorithm_param dict and n_jobs is set only through the class's n_jobs argument. In addition, some tqdm calls are improved.
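The fallback behavior described here could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: `resolve_n_jobs` is a hypothetical helper, and the dict key `"n_jobs"` is assumed to match what the ANN backend expects.

```python
def resolve_n_jobs(algorithm_param, n_jobs=None):
    """Return a copy of the parameter dict with n_jobs filled in if missing.

    A value given explicitly in the dict always wins; the class-level
    n_jobs is only used as a fallback. (Hypothetical helper.)
    """
    params = dict(algorithm_param or {})
    if "n_jobs" not in params and n_jobs is not None:
        params["n_jobs"] = n_jobs
    return params
```

Copying the dict avoids mutating the caller's arguments, which matters if the same dict is reused across several estimators.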

codecov[bot] commented 4 years ago

Codecov Report

Merging #61 into master will decrease coverage by 0.01%. The diff coverage is 96.96%.


@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
- Coverage   99.15%   99.14%   -0.02%     
==========================================
  Files          55       57       +2     
  Lines        4512     4560      +48     
  Branches      499      501       +2     
==========================================
+ Hits         4474     4521      +47     
- Misses         19       21       +2     
+ Partials       19       18       -1     
| Impacted Files | Coverage Δ |
|---|---|
| skhubness/neighbors/base.py | 96.49% <85.71%> (-0.37%) :arrow_down: |
| skhubness/__init__.py | 100.00% <100.00%> (ø) |
| skhubness/analysis/estimation.py | 99.57% <100.00%> (-0.01%) :arrow_down: |
| skhubness/neighbors/classification.py | 100.00% <100.00%> (ø) |
| skhubness/neighbors/tests/test_classification.py | 100.00% <100.00%> (ø) |
| skhubness/utils/multiprocessing.py | 100.00% <100.00%> (ø) |
| skhubness/neighbors/tests/test_neighbors.py | 99.89% <0.00%> (+0.10%) :arrow_up: |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 31ceacd...64b7e29.