Closed by Marius1311 1 year ago
The adaptive thresholding only helps when computing the Schur decomposition, macrostates, etc. I reran the tutorial on the entire dataset, and these steps are fast (<= 1 minute each). The bottleneck is `wk.compute_transition_matrix`, which took over an hour to finish.
The eigengap now falls after the fourth eigenvalue (before, it was after the first, so this is better), but the same macrostates are inferred as before.
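For readers unfamiliar with the term, here is a minimal sketch of what "the eigengap falls after the k-th eigenvalue" means. The spectrum below is made up for illustration and is not taken from the run above. `eigengap_index` is a hypothetical helper, and CellRank may use a different criterion internally.

```python
def eigengap_index(eigenvalues):
    """Return k (1-based) such that the largest gap lies between the
    k-th and (k+1)-th eigenvalue, ordered by decreasing magnitude."""
    mags = sorted((abs(v) for v in eigenvalues), reverse=True)
    gaps = [mags[i] - mags[i + 1] for i in range(len(mags) - 1)]
    return gaps.index(max(gaps)) + 1


# Illustrative (made-up) spectrum: four eigenvalues close to 1, then a drop.
eigs = [1.0, 0.98, 0.97, 0.95, 0.60, 0.55, 0.50]
k = eigengap_index(eigs)
print(f"largest eigengap after eigenvalue {k}")
```

A gap after the fourth eigenvalue suggests four slowly mixing groups of states, which is why it is a more informative result than a gap after the first.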
Thanks a lot @WeilerP for running this, that's great! Good to see we get the same macrostates. You're right, the part that actually computes the transport maps does take a while. Maybe we should pre-compute them and load them from file (I guess they can be cached in the kernel, @michalk8?). I think that would be the best solution, since we could then showcase how to run CellRank on really large data.
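A rough sketch of the "pre-compute and load from file" idea, using only the Python standard library. This is not CellRank's own caching mechanism. `load_or_compute` is a hypothetical helper, and the toy function stands in for the expensive `wk.compute_transition_matrix` step.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Return the cached result if the file exists; otherwise compute it,
    write it to disk, and return it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


# Toy stand-in for the expensive step (assumption: the real call would
# wrap the transport-map computation instead).
calls = []


def expensive_transition_matrix():
    calls.append(1)  # record that the expensive path was taken
    return [[0.9, 0.1], [0.2, 0.8]]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "transition_matrix.pkl"
    first = load_or_compute(cache_file, expensive_transition_matrix)   # computes and writes
    second = load_or_compute(cache_file, expensive_transition_matrix)  # reads from disk
```

In a tutorial, the cached file would be shipped alongside the notebook so that only the first run (or the maintainers) pays the hour-long cost.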
Currently, our tutorial subsamples the data to 25% of the cells to speed up computations. Rather than doing that, we should use our adaptive thresholding scheme, which is currently not used. It would be good to see whether this changes the results; it's okay if it does, since subsampling to 1/4 of the data is quite a reduction in cell number.
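The thread does not spell out how the adaptive thresholding scheme works. As a hedged illustration only, one plausible rule is to pick the largest cutoff that still leaves at least one non-zero entry in every row, then zero everything below it and renormalise. Both function names below are hypothetical, and the matrix is a toy example.

```python
def adaptive_threshold(matrix):
    """Largest cutoff that keeps at least one non-zero entry per row:
    the minimum over rows of each row's maximum."""
    return min(max(row) for row in matrix)


def sparsify(matrix, threshold):
    """Zero entries below `threshold`, then renormalise rows to sum to 1."""
    out = []
    for row in matrix:
        kept = [v if v >= threshold else 0.0 for v in row]
        total = sum(kept)  # > 0 whenever threshold <= the row maximum
        out.append([v / total for v in kept])
    return out


# Toy row-stochastic matrix (assumption: real transition matrices are
# large and sparse, which is where thresholding pays off).
T = [
    [0.50, 0.45, 0.05],
    [0.10, 0.30, 0.60],
    [0.30, 0.30, 0.40],
]
thr = adaptive_threshold(T)
T_sparse = sparsify(T, thr)
```

Unlike subsampling to 25% of the cells, this keeps every cell and only drops weak transitions, so the two approaches may give different downstream results.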