meringlab / FlashWeave.jl

Inference of microbial interaction networks from large-scale heterogeneous abundance data

Using metadata with normalize = false fails #24

Closed markobudinich closed 3 years ago

markobudinich commented 3 years ago

Hi! I'm trying to run a custom CLR normalization on my data and then trying to use FlashWeave to infer a network. It works, except that when I try to add the metadata it fails. I was able to reproduce the issue using the test data in the repo:

using FlashWeave

# Test data shipped with the FlashWeave repository
data_global = "test/data/HMP_SRA_gut/HMP_SRA_gut_tiny.tsv"
meta_path = "test/data/HMP_SRA_gut/HMP_SRA_gut_tiny_meta.tsv"

net_res_global = learn_network(data_global, meta_path, n_obs_min=3, normalize=true)  # works
net_res_global = learn_network(data_global, meta_path, n_obs_min=3, normalize=false) # fails

Any ideas on the issue?

jtackm commented 3 years ago

Hi Marko, thanks for the feedback. The normalize flag was originally intended for an already pre-loaded and properly formatted input matrix (see the function learn_network(data; ...)), not so much for input paths, since files may contain heterogeneous data types that need to be homogenized during normalization and pre-processing.

However, since normalize is accidentally propagated anyway, we may as well support this use case. I just made a patch that forces data type homogenization if normalize=false. Could you give it a try via ] add FlashWeave#master?

markobudinich commented 3 years ago

Thanks for the quick answer, Janko. I tried the patch on my example above and it works fine; however, it is not working for my particular case. Given your explanation, I think it is best to close the issue and go with the learn_network(data; ...) solution. Thanks again!

jtackm commented 3 years ago

Okay. Custom normalization can be tricky in FlashWeave (different modes have different data requirements, hence the warning message), so let me know if you stumble on anything else.
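
For readers following the learn_network(data; ...) route discussed above, here is a minimal sketch of a custom CLR step on a pre-loaded matrix. The helper clr_rows, the fixed pseudocount and the toy counts are assumptions made for illustration; they are not part of FlashWeave and are not the normalization used by the original poster. Whether values transformed this way meet the data requirements of a given FlashWeave mode is not settled in this thread.

```julia
using Statistics

# Hypothetical helper (not a FlashWeave function): centered log-ratio transform
# of each row (sample) in a samples x features count matrix.
# Assumption: a fixed pseudocount is added so that zero counts do not produce log(0).
function clr_rows(counts::AbstractMatrix{<:Real}; pseudocount::Real=1.0)
    logged = log.(counts .+ pseudocount)
    # Subtract each row's mean log value so that every row sums to zero
    return logged .- mean(logged, dims=2)
end

# Toy data for illustration only: 3 samples x 3 features
counts = [10 0 5; 3 7 0; 0 2 8]
clr_data = clr_rows(counts)

# With FlashWeave loaded, the matrix form mentioned in the thread would then be
# called along these lines (left commented out so the sketch runs on its own):
# net = learn_network(clr_data; normalize=false)
```

Keeping the transform in a separate function makes it easy to check the matrix (for example, that each row sums to roughly zero) before handing it to learn_network.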