meringlab / FlashWeave.jl

Inference of microbial interaction networks from large-scale heterogeneous abundance data

Use of FlashWeave on meta-omic protein data? #40

Open Rridley7 opened 1 month ago

Rridley7 commented 1 month ago

Hello, thanks for developing this tool! I was curious about your thoughts on using it on meta-omic datasets that comprise genetic information (e.g. metatranscriptomics, gene-level metagenomics). My initial thoughts on this would be:

jtackm commented 4 weeks ago

Hi,

Yes, the framework is applicable to many types of data, though it has only been properly benchmarked with OTU counts + meta variables. Scalability should in principle not be an issue: I've run it on tables as large as 1 million samples x 100k variables. One thing to consider: the high dimensionality of your data combined with a relatively low sample count could lead to statistical power issues when running with default parameters (in particular, the default max_k=3 would have to be lowered to 2 or even 1).

The main consideration would be normalization, i.e. different types of data may require different normalization methods. I suggest manually normalizing your tables as appropriate and specifying normalize=false in learn_network. FlashWeave already includes a (poorly documented) feature to provide several independently normalized tables that should be useful for this, e.g.: learn_network([<norm_omic1_data_path>, <norm_omic2_data_path>], meta_data_path; normalize=false, <kwargs...>)
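
As a rough illustration of the manual normalization route described above, here is a minimal Julia sketch that applies a centered log-ratio (CLR) transform with a pseudocount to a toy count matrix. CLR is only one possible choice and is an assumption made for this example, not something prescribed in this thread; the helper name clr_rows, the pseudocount of 1 and the toy data are likewise hypothetical. The appropriate normalization will depend on the data type. The learn_network call is left commented out and reuses the placeholder paths from the comment above, with max_k=2 as one of the lowered settings mentioned earlier.

```julia
using Statistics

# Centered log-ratio transform applied per sample (row).
# `pseudocount` avoids log(0); its value is an illustrative assumption.
function clr_rows(counts::AbstractMatrix{<:Real}; pseudocount::Real=1.0)
    logged = log.(counts .+ pseudocount)
    # Subtract each row's mean log value, i.e. the log of its geometric mean.
    return logged .- mean(logged, dims=2)
end

counts = [10 0 5; 3 7 2]   # toy table: 2 samples x 3 features
normed = clr_rows(counts)

# Each row of a CLR-transformed table sums to (numerically) zero.
@assert all(abs.(sum(normed, dims=2)) .< 1e-8)

# Each normalized table would then be written to its own file and passed on,
# following the call pattern quoted above (placeholders left as-is):
# using FlashWeave
# learn_network([<norm_omic1_data_path>, <norm_omic2_data_path>], meta_data_path;
#               normalize=false, max_k=2)
```

Normalizing each omic table separately, as sketched here, keeps the choice of method independent per data type before the tables are combined by learn_network.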