hsmaan / balanced-clustering

Reworked clustering metrics for assessing performance in imbalanced settings
MIT License

Emi using numba and package cleanup #6

Closed adamgayoso closed 2 years ago

adamgayoso commented 2 years ago
hsmaan commented 2 years ago

Currently the tests are failing because numba is incompatible with the scipy.sparse._csr.csr_matrix class. This will have to be fixed upstream of the emi call (unless there is a way to resolve the incompatibility itself, but I couldn't find an obvious solution).

adamgayoso commented 2 years ago

Ah right, I didn't catch that. It looks like the a and b terms need to be computed outside of numba (this won't be a real slowdown, as the scipy sum is quite optimized). What you can do is create a private function in this file containing the numba code, and call that function from the emi function, which computes a and b, the shape, and so on.
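A minimal sketch of that split, for illustration only. The names `_emi_kernel` and `expected_mutual_info` are hypothetical, and the kernel uses the textbook expected mutual information formula (hypergeometric model), which may differ from the balanced variant in this repository. The wrapper does the sparse work with scipy and hands only plain NumPy arrays and an integer to the compiled function. The `try`/`except` is there only so the sketch still runs where numba is not installed.

```python
import math

import numpy as np
from scipy import sparse

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: fall back to plain Python if numba is missing.
    def njit(func):
        return func


@njit
def _emi_kernel(a, b, n):
    """Hypothetical private kernel: sees only dense integer arrays, never a sparse matrix."""
    emi = 0.0
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            ai = a[i]
            bj = b[j]
            lo = max(1, ai + bj - n)
            hi = min(ai, bj)
            for nij in range(lo, hi + 1):
                # Log of the hypergeometric weight, via lgamma to avoid huge factorials.
                log_w = (
                    math.lgamma(float(ai + 1)) + math.lgamma(float(bj + 1))
                    + math.lgamma(float(n - ai + 1)) + math.lgamma(float(n - bj + 1))
                    - math.lgamma(float(n + 1)) - math.lgamma(float(nij + 1))
                    - math.lgamma(float(ai - nij + 1)) - math.lgamma(float(bj - nij + 1))
                    - math.lgamma(float(n - ai - bj + nij + 1))
                )
                emi += (nij / n) * math.log(n * nij / (ai * bj)) * math.exp(log_w)
    return emi


def expected_mutual_info(contingency):
    """Hypothetical public wrapper: computes a, b and n with scipy, then calls the kernel."""
    contingency = sparse.csr_matrix(contingency)
    # Row and column sums are computed outside numba, on the sparse matrix itself.
    a = np.asarray(contingency.sum(axis=1)).ravel().astype(np.int64)
    b = np.asarray(contingency.sum(axis=0)).ravel().astype(np.int64)
    n = int(a.sum())
    return _emi_kernel(a, b, n)


value = expected_mutual_info(sparse.csr_matrix(np.eye(2, dtype=np.int64)))
```

For the 2x2 identity contingency table used here, every relabelling has the same mutual information, so `value` should come out at approximately log(2).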

adamgayoso commented 2 years ago

Everything should work now. I also cleaned up some other stray files and added a proper .gitignore.

hsmaan commented 2 years ago

Perfect, thanks for all the refactoring. All checks/tests passed, so going to go ahead and merge.

I did want to compare the numerical outputs of the cython and numba implementations, and their stability, but I can do this in a branch off of scib_metrics and let you know if there are any issues.