alevax / pyviper

Porting of Protein Activity and Pathway Inference to single cell and Python.
MIT License

Can we do meta_aREA without using a long data format? Is this possible? #14

Closed. alexanderlewis99 closed this issue 1 year ago

alexanderlewis99 commented 1 year ago

It would be nice if we could find a more efficient way to run meta_aREA that avoids converting the data between short (wide) and long formats (i.e. melt), so that we can save computational time.

alexanderlewis99 commented 1 year ago

I looked into this. We want to know, for each sample, the proportion of regulators enriched per network, and different networks have different numbers of regulators. That leaves no clear way to line up each sample with multiple regulators from different networks without using a long format. If we tried working in the wide format (the normal VIPER matrices), we would have matrices with different dimensions, since each network has its own set of regulators. So that approach is not feasible, and the computational time spent on the long format is necessary.
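To make the shape mismatch concrete, here is a minimal sketch in plain Python. It is not pyviper's actual code: the network names, scores, the `melt` and `proportion_enriched` helpers, and the 1.96 enrichment cutoff are all hypothetical, chosen only to illustrate why tables with different regulator sets are easier to combine as long-format rows.

```python
from collections import defaultdict

# Hypothetical wide-format scores: one samples-by-regulators table per network.
# The two networks deliberately cover different regulator sets, so their
# tables have different widths (3 columns vs 2 columns).
wide_by_network = {
    "net_A": {
        "cell_1": {"TF1": 2.5, "TF2": -0.3, "TF3": 1.9},
        "cell_2": {"TF1": 0.1, "TF2": 2.2, "TF3": -1.0},
    },
    "net_B": {
        "cell_1": {"TF2": 2.1, "TF4": 0.4},
        "cell_2": {"TF2": 0.2, "TF4": 0.3},
    },
}


def melt(tables):
    """Flatten every table into (network, sample, regulator, score) rows."""
    return [
        (network, sample, regulator, score)
        for network, table in tables.items()
        for sample, row in table.items()
        for regulator, score in row.items()
    ]


def proportion_enriched(long_rows, threshold=1.96):
    """Fraction of regulators scoring above `threshold`, per (sample, network).

    The threshold is an illustrative assumption, not a pyviper default.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for network, sample, _regulator, score in long_rows:
        key = (sample, network)
        totals[key] += 1
        hits[key] += score > threshold  # True counts as 1
    return {key: hits[key] / totals[key] for key in totals}


long_rows = melt(wide_by_network)
proportions = proportion_enriched(long_rows)

for (sample, network), value in sorted(proportions.items()):
    print(sample, network, round(value, 3))
```

Once everything is in long form, the per-network denominators (3 regulators for one network, 2 for the other) fall out of a single grouped pass, which is the alignment that mismatched wide matrices do not give directly.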

alexanderlewis99 commented 1 year ago

This issue has been addressed by the following commits: 1207470 8614ed8 3e3895c

The code now runs much faster. One test dataset with 70K cells and 2500 genes that took 339 seconds now takes only 75 seconds with meta_aREA, roughly a 4.5x speedup. We see a similar improvement for meta_narnea as well.