stjude / proteinpaint

Data visualization and analysis framework focused on phenotype-molecular data integration at cohort level.
https://proteinpaint.stjude.org/

fix: Speed up top variably expressed genes query from gdc api, direct… #1775

Closed xzhou82 closed 5 days ago

xzhou82 commented 1 week ago

…ly submit case filter and do not first retrieve list of cases

Description

querying top variably expressed genes from gdc is now faster. previously there was a step to first query the list of cases passing the filter (takes 10 seconds, especially bad when there's no cohort), then pass these cases to the /gene_selection/ api to get the top genes. that case querying step is now eliminated: the cohort is passed directly to the api as a case filter (didn't know that was possible before)
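to make the change concrete, here is a minimal sketch of the two request shapes. the function names, type names and body field names (`case_ids`, `case_filters`, `selection_size`) are illustrative assumptions and are not confirmed against the gdc api or the proteinpaint code. it only builds the request bodies and makes no network calls.

```typescript
// Hypothetical sketch: all names and field names here are assumptions for illustration,
// not the confirmed gdc /gene_selection/ schema.
type CohortFilter = Record<string, unknown>;

interface TopGenesRequest {
  case_ids?: string[]; // assumed field: explicit list of cases (old flow)
  case_filters?: CohortFilter; // assumed field: cohort filter sent as-is (new flow)
  selection_size: number; // assumed field: how many top genes to return
}

// Old flow: the caller has already spent a separate (slow) query resolving
// the cohort filter into a list of case ids, and sends that list.
function buildOldRequest(caseIds: string[], size: number): TopGenesRequest {
  return { case_ids: caseIds, selection_size: size };
}

// New flow: no case lookup; the cohort filter goes straight into the body.
// Assumption: when there is no cohort, the filter field is simply omitted.
function buildNewRequest(filter: CohortFilter | null, size: number): TopGenesRequest {
  const body: TopGenesRequest = { selection_size: size };
  if (filter) body.case_filters = filter;
  return body;
}
```

the point of the second builder is that it needs no prior round trip, so the no-cohort case costs nothing extra.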

for expression clustering, the list of cases is still retrieved from the cohort as before, since the app must restrict to 1K cases, otherwise it breaks. no speed gain there
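a rough sketch of why clustering still needs the case list: the limit has to be applied to concrete case ids before the request is made. `capCases` and `MAX_CASES` are hypothetical names, and truncating to the first 1000 is an assumption; this PR does not say how the app picks which cases to keep.

```typescript
// Hypothetical sketch: names and the truncation strategy are assumptions.
const MAX_CASES = 1000; // the "1K" limit on cases for clustering

// Return at most `max` case ids, keeping the original order.
function capCases(caseIds: string[], max: number = MAX_CASES): string[] {
  return caseIds.length > max ? caseIds.slice(0, max) : caseIds;
}
```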

all tested to work: adhoc and integrated views, non-ci test, adding gene exp row to matrix


xzhou82 commented 5 days ago

thanks for testing. i haven't run into the case id issue, and adding gene exp to oncomatrix works. could you delete the cache at ~/data/cache/extApiResponse/ and recache to test again?

congyu-lu commented 5 days ago

Thanks, no error now and did see speed gain