Open sroyyors opened 3 years ago
Hi, sorry, to clarify: the case where I saw the algorithm making some progress had 2k and 20k cells, with all features included. When I went to 5k and 50k cells, things have not been moving forward. Just to clarify, dimensionality reduction on the 5k and 50k datasets did not help either.
Hi. For a distance-based or kernel-based algorithm, it is difficult to scale to very large datasets with up to ~10^6 cells, because the computational complexity depends on the number of samples. We are still working on this. But I think 5k or 50k cells can be handled by UnionCom.
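To put a rough number on why the sample count (not the feature count) is the bottleneck, here is a back-of-envelope sketch (my own estimate, not an official UnionCom figure): a dense pairwise distance or kernel matrix has n x n entries, so its memory alone grows quadratically with the number of cells.

```python
# Back-of-envelope: memory for a dense float64 n x n distance/kernel matrix.
# Reducing the number of features does not shrink this matrix; only n (cells) does.
for n in (5_000, 50_000):
    gb = n * n * 8 / 1e9  # 8 bytes per float64 entry
    print(f"{n} cells -> {gb:.1f} GB")
# 5000 cells -> 0.2 GB
# 50000 cells -> 20.0 GB
```

This is why reducing the feature dimensionality from thousands down to 10 or 15 can leave the runtime almost unchanged: the per-iteration cost is dominated by the number of cells.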
Here are some ideas:
Hi, I am running UnionCom to integrate two datasets, one with 5k cells and one with 50k cells. I gave it the full feature matrices and it was going slowly, so I decided to reduce the dimensionality to 10 and 15, respectively, using NMF. But now it is even slower and not really making progress. I am wondering if you have an idea of the expected runtime, and whether you recommend doing any dimensionality reduction.
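For what it is worth, since reducing the feature dimensionality did not help here, a common workaround for methods whose cost scales with the square of the number of samples is to subsample cells before integration. This is a generic sketch of that idea (my own suggestion, not something from the UnionCom documentation; `subsample_cells` is a hypothetical helper):

```python
import numpy as np

def subsample_cells(X, n_max, seed=0):
    """Randomly keep at most n_max rows (cells); return the subset and its indices."""
    rng = np.random.default_rng(seed)
    if X.shape[0] <= n_max:
        return X, np.arange(X.shape[0])
    idx = rng.choice(X.shape[0], size=n_max, replace=False)
    return X[idx], idx

# e.g. bring a 50k-cell matrix down to 10k cells before running UnionCom
X = np.random.rand(50_000, 15)
X_small, kept = subsample_cells(X, 10_000)
print(X_small.shape)  # (10000, 15)
```

Keeping the returned indices lets you map any learned alignment back to the subsampled cells afterwards.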