Open Rridley7 opened 1 month ago
Hi,
Yes, the framework is applicable to many types of data, though it has only been properly benchmarked with OTU counts + meta variables. Scalability should in principle be no issue; I've run it on tables as large as 1 million samples x 100k variables. One thing to consider: the high dimensionality of your data and relatively low sample count could lead to power issues when running with default parameters (in particular, `max_k=3` would have to be relaxed to 2 or even 1).
The main consideration would be normalization, i.e. different types of data may require different normalization methods. I suggest manually normalizing your tables as appropriate and specifying `normalize=false` in `learn_network`. FlashWeave already includes a (poorly documented) feature to provide several independently normalized tables that should be useful for this, e.g.: `learn_network([<norm_omic1_data_path>, <norm_omic2_data_path>], meta_data_path; normalize=false, <kwargs...>)`
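As a rough illustration of the "normalize manually, then pass `normalize=false`" workflow, here is a minimal sketch. It assumes a centered log-ratio (CLR) transform with a pseudocount is appropriate for your data, which may not be the case for every omics type. The helper `clr_rows`, the toy matrix, and the file names in the commented-out call are hypothetical and not part of FlashWeave.

```julia
using Statistics

# Hypothetical helper (not part of FlashWeave): centered log-ratio transform,
# with one sample per row and one variable per column.
# The pseudocount is an assumption made here to avoid log(0) on zero counts.
function clr_rows(counts::AbstractMatrix{<:Real}; pseudocount::Real=1.0)
    logged = log.(counts .+ pseudocount)
    # Subtract each sample's mean log value so every row is centered at zero.
    return logged .- mean(logged, dims=2)
end

# Toy count table: 2 samples x 3 genes (made-up numbers for illustration only).
counts = [10 0 5; 3 8 1]
normalized = clr_rows(counts)

# After normalizing each omics table separately and writing it to disk,
# the call suggested above would look roughly like this (file names are
# placeholders; max_k lowered as discussed for low sample counts):
# using FlashWeave
# netw = learn_network(["omic1_clr.tsv", "omic2_clr.tsv"], "meta.tsv";
#                      normalize=false, max_k=1)
```

Each table can be normalized with whatever method suits it; the point is only that FlashWeave's internal normalization is switched off once you have done it yourself.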
Hello, thanks for developing this tool! I was curious about your thoughts on using it on meta-omic datasets that comprise genetic information (e.g. metatranscriptomics, gene-level metagenomics). My initial thoughts on this would be: