chanzuckerberg / cellxgene

An interactive explorer for single-cell transcriptomics data
https://chanzuckerberg.github.io/cellxgene/
MIT License
634 stars 119 forks

Gene sets load slowly for large sets / large datasets #2289

Open ambrosejcarr opened 3 years ago

ambrosejcarr commented 3 years ago

The performance issue

On the Azimuth dataset, which has approximately 1 million cells and 77,000 genes, a gene set of 194 genes takes 105 s to fully load.

To Reproduce

Steps to reproduce the behavior:

  1. Download the Azimuth dataset.
  2. Download the hallmark.csv file containing hallmark gene sets in cellxgene format.
  3. Run cellxgene launch local.h5ad --gene-sets-file hallmark.csv
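
To see which gene sets in the file are large before launching, a minimal sketch like the following may help. It assumes the gene sets CSV has columns named `gene_set_name` and `gene_symbol`; that layout is an assumption, not something confirmed in this thread. The `gene_set_sizes` helper and the sample rows are hypothetical and not taken from hallmark.csv.

```python
import csv
import io


def gene_set_sizes(csv_text):
    """Count distinct gene symbols per gene set in a gene sets CSV.

    Assumes columns named `gene_set_name` and `gene_symbol` (an assumption
    about the file layout, not confirmed in this thread).
    """
    genes_by_set = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["gene_set_name"].strip()
        symbol = row["gene_symbol"].strip()
        if name and symbol:
            # A set per gene set name ignores repeated symbols.
            genes_by_set.setdefault(name, set()).add(symbol)
    return {name: len(genes) for name, genes in genes_by_set.items()}


# Hypothetical sample rows for illustration only.
SAMPLE = """gene_set_name,gene_set_description,gene_symbol,gene_description
SET_A,example set,GENE1,
SET_A,example set,GENE2,
SET_B,another set,GENE3,
"""

if __name__ == "__main__":
    for name, size in sorted(gene_set_sizes(SAMPLE).items()):
        print(f"{name}: {size} genes")
```

For a real file, the text could be read with `open("hallmark.csv", newline="").read()` and passed to the same function.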

Version (please complete the following information):

maniarathi commented 3 years ago

I believe this isn't necessarily a regression. If you load the Azimuth dataset without gene sets, it takes just as long because the dataset is so large. In contrast, if you run cellxgene launch local.h5ad --gene-sets-file hallmark.csv --backed, the load time is quite short. Of course, the slowness is then passed on to other parts.

@bkmartinjr am I roughly on target with the assessment here? If so, my question to @signechambers1 would be whether to invest in this for 1.0 or not.

bkmartinjr commented 3 years ago

> @bkmartinjr am I roughly on target with the assessment here? If so, my question to @signechambers1 would be whether to invest in this for 1.0 or not.

I don't know - would need to explore a bit to give you an idea. Historically, we almost always have regressions (initially) when making changes like this, so it isn't all that unlikely. LMK if you want me to investigate.

Any fixes here would benefit all deployments. And it has been a long time since I did a performance sweep of the component rendering (i.e., there are likely some wins available).

maniarathi commented 3 years ago

OK, I think I may have misunderstood -- @ambrosejcarr what did you mean by load? As in the time it takes before localhost:5000 is available, or the time it takes to render the UI once you land on localhost:5000?

maniarathi commented 3 years ago

My bad, chatted offline with Ambrose. This is the time it takes for all the genes to load when you open a large gene set.

ambrosejcarr commented 3 years ago

Arathi is correct - the time from unfurling a large gene set to the time when the last gene in the gene set loads. Sorry my initial description wasn't clear!
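
One way to pin this metric down is sketched below: start a timer when the loads are kicked off and stop it when the last gene finishes. This is only an illustration of the measurement, not how cellxgene loads genes; `time_until_last_loaded`, `load_gene`, `fake_load_gene`, and the `max_workers` value are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def time_until_last_loaded(gene_names, load_gene, max_workers=4):
    """Return (total_seconds, finished_at) for loading every gene.

    `load_gene` is a caller-supplied function (hypothetical here) that
    blocks until one gene's data has loaded. `finished_at` maps each gene
    to the seconds elapsed from the start until that gene finished.
    """
    start = time.perf_counter()
    finished_at = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(load_gene, g): g for g in gene_names}
        for future in as_completed(futures):
            future.result()  # re-raise any loading error
            finished_at[futures[future]] = time.perf_counter() - start
    # The metric in question: elapsed time when the last gene finished.
    total = max(finished_at.values()) if finished_at else 0.0
    return total, finished_at


def fake_load_gene(gene):
    """Stand-in for a real per-gene request; sleeps briefly."""
    time.sleep(0.01)
    return gene


if __name__ == "__main__":
    total, per_gene = time_until_last_loaded(["A", "B", "C"], fake_load_gene)
    print(f"last gene finished after {total:.3f} s")
```

Recording a finish time per gene also shows whether the total is dominated by a few slow genes or spread evenly across the set.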

On Wed, Jul 14, 2021 at 6:33 PM maniarathi wrote:

> My bad, chatted offline with Ambrose. This is the time it takes for all the genes to load when you open a large gene set.

signechambers1 commented 3 years ago

Removing from Desktop 1.0 epic since we made more granular tickets for the fixes that are in for Desktop 1.0. Keeping open for tracking purposes and will reevaluate after testing.