saezlab / liana

LIANA: a LIgand-receptor ANalysis frAmework
https://saezlab.github.io/liana/
GNU General Public License v3.0

Run time and memory requirements #10

Closed: paulyashna closed this issue 3 years ago

paulyashna commented 3 years ago

Hi,

Could you comment on the run times of the tool for different numbers of cells, e.g. 5k or 10k? I have been trying with >10k cells, and I think downsampling the Seurat object would be a good option. Has the consistency of the results been checked at different downsampling thresholds?

I would also like to know the run times and the number of processing cores required, if any, when more than one method is run at a time. If I run CellChat with 5k cells it is pretty fast, but as soon as I include another method the process never finishes. Approximately how long should this take? A verbose=TRUE option would be nice for tracking progress. I would greatly appreciate your thoughts on this.

Thanks, Yashna

deeenes commented 3 years ago

"A verbose=TRUE option would be nice to track the process."

@dbdimitrov For this I can recommend logger (https://daroczig.github.io/logger/). We use it in OmnipathR: https://github.com/saezlab/OmnipathR/blob/e50affdeeab2f8b85b54b6d841d6a7a694eda7ab/R/options.R#L377

dbdimitrov commented 3 years ago

Hi Yashna,

We did check how consistent each method is when the cells are downsampled, and most are very consistent. This is expected, since the mean gene expression per cluster changed little as we downsampled. The only notable differences (AUROC ~0.8) were in the permutation-based methods (i.e. Squidpy and CellChat), but these did not seem too meaningful given how different the methods are in this regard.
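
To illustrate why stable per-cluster means are expected under downsampling, here is a minimal Python sketch on synthetic data. It is not LIANA code and does not reproduce our benchmark: the cluster names, cluster sizes, expression distributions, and downsampling fractions are all assumptions made up for the example. It only shows that the mean of a randomly subsampled cluster stays close to the mean of the full cluster.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic data: expression of a single gene per cell, grouped by cluster.
# Cluster names, sizes and distributions are hypothetical.
clusters = {
    "T cells": [random.gauss(2.0, 0.5) for _ in range(5000)],
    "B cells": [random.gauss(0.5, 0.5) for _ in range(3000)],
    "Monocytes": [random.gauss(1.2, 0.5) for _ in range(2000)],
}


def cluster_means(cells_by_cluster):
    """Mean expression of the gene in each cluster."""
    return {name: mean(values) for name, values in cells_by_cluster.items()}


def downsample(cells_by_cluster, fraction):
    """Keep a random fraction of the cells in every cluster."""
    return {
        name: random.sample(values, max(1, int(len(values) * fraction)))
        for name, values in cells_by_cluster.items()
    }


full = cluster_means(clusters)
max_shift = {}
for fraction in (0.5, 0.2, 0.1):
    sub = cluster_means(downsample(clusters, fraction))
    # Largest absolute change in any cluster mean relative to the full data
    max_shift[fraction] = max(abs(sub[name] - full[name]) for name in full)
    print(f"fraction={fraction}: max shift in cluster mean = {max_shift[fraction]:.3f}")
```

Methods that score interactions mainly from cluster-level averages inherit this stability, whereas permutation-based methods add their own sampling noise on top, which is consistent with the somewhat lower agreement mentioned above.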

Also agreed, I will aim to add a verbose=TRUE option soon.

I think I might have an idea of what's behind the issue you're observing. I noticed that Connectome (when called via LIANA) hangs indefinitely for some datasets, but not when called independently (i.e. via the call_connectome function). Honestly, I could not find the reason for this; I only narrowed it down to Seurat's ScaleData function when it is executed via exec or do.call. There seems to be very little information about this.

For now, I would recommend excluding Connectome from the list of methods if this occurs for your dataset. I will reimplement the method soon and hopefully the issue will be resolved.

Regards, Daniel

paulyashna commented 3 years ago

Thanks, Daniel, for your comments and quick reply. This is certainly helpful. Best Regards, Yashna

dbdimitrov commented 3 years ago

Hi Yashna,

Just wanted to mention that the issue with Connectome should now be resolved.

Regards, Daniel

paulyashna commented 3 years ago

Hi Daniel, thanks for this, I will check it out. Do you have any idea why I keep getting a 'cell label should not contain 0' error with every method I try? I understand that the metadata is taken directly from the Seurat object. I have added cluster annotations under the seurat_annotations column in the metadata, and I still get this error. I would appreciate any help with this.