sourmash-bio / sourmash_plugin_betterplot

Improved plotting/viz and cluster examination for sourmash
BSD 3-Clause "New" or "Revised" License

figure out sparse vs dense matrix stuff #28

Open ctb opened 1 month ago

ctb commented 1 month ago

As of v0.3.2, we have the command pairwise_to_compare, which seems to work OK (but, you know, tests needed! See https://github.com/sourmash-bio/sourmash_plugin_betterplot/issues/24).

This is implemented in hokey code,

https://github.com/sourmash-bio/sourmash_plugin_betterplot/blob/59b870da8d4c9ad21f96fd0dfeb2e4daf0affd72/src/sourmash_plugin_betterplot.py#L423

and there is broken sparse-matrix loading code here, https://github.com/sourmash-bio/sourmash_plugin_betterplot/blob/59b870da8d4c9ad21f96fd0dfeb2e4daf0affd72/src/sourmash_plugin_betterplot.py#L557

and it feels like the mds2 command https://github.com/sourmash-bio/sourmash_plugin_betterplot/blob/59b870da8d4c9ad21f96fd0dfeb2e4daf0affd72/src/sourmash_plugin_betterplot.py#L489 should be able to operate entirely on sparse matrix representations,

but I find the scipy sparse matrix stuff https://docs.scipy.org/doc/scipy/reference/sparse.html difficult to interpret, and I keep on running into "this operation not allowed".
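For reference, one common way to hit a "not allowed" error is arithmetic that would turn every implicit zero into a nonzero, such as converting similarities to distances with `1 - matrix`. This may or may not be the error encountered here; the sketch below uses made-up values and assumes a CSR matrix of similarities. The usual workaround is to transform only the stored values through `.data`.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Made-up similarity matrix; zeros are not stored.
sim = csr_matrix(np.array([
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]))

# Subtracting a sparse matrix from a nonzero scalar would densify it,
# so scipy is expected to refuse rather than do it silently.
try:
    dist = 1 - sim
except NotImplementedError as exc:
    print("not allowed:", exc)

# Workaround: operate on the stored values only.
dist = sim.copy()
dist.data = 1.0 - dist.data

# Caveat: entries that were never stored stay unstored, so they read back
# as 0 even though "similarity 0" would mean "distance 1".
print(dist.toarray())
```

The caveat in the last comment matters for anything downstream that treats the matrix as distances: an unstored entry has to be interpreted as "no comparison recorded", not as "distance zero".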

To the extent that I have any understanding at all, it's that we should be looking at COO arrays, maybe?
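A minimal sketch of the COO idea, assuming the pairwise output can be reduced to (row index, column index, similarity) triples. The `pairs` list and `n` below are hypothetical stand-ins, not the plugin's actual data structures. COO is convenient for construction from triples; converting to CSR afterwards gives a format that supports indexing and fast row operations.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical pairwise results: (i, j, similarity), upper triangle only.
pairs = [(0, 1, 0.9), (2, 3, 0.4)]
n = 4  # number of sketches being compared

rows, cols, vals = (np.array(x) for x in zip(*pairs))
diag = np.arange(n)

# Store each pair in both orientations, plus a self-similarity of 1.0.
r = np.concatenate([rows, cols, diag])
c = np.concatenate([cols, rows, diag])
v = np.concatenate([vals, vals, np.ones(n)])

coo = coo_matrix((v, (r, c)), shape=(n, n))

# Convert once construction is done; CSR supports element and row access.
csr = coo.tocsr()
print(csr[0, 1])      # similarity for the stored pair (0, 1)
print(csr.toarray())  # dense view, only sensible for small n
```

Building all the triples up front and constructing the matrix in one call avoids incremental assignment into a sparse matrix, which is slow in CSR.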

pls send help. kthxbye.