Interpretation of the results applying LEfSe

bfalco commented 2 years ago

Hello Chi,

Thank you very much for your kind response by email. I attach the microtable (zip file) and the lines of code that you requested to solve my question.

LEfSe.zip

set.seed(123)
t1 <- trans_diff$new(dataset = dataset, width = 0.3, method = "lefse", group = "Group", taxa_level = "all", alpha = 0.05)
t1$plot_diff_bar(threshold = 4)
t1$plot_diff_abund(use_number = 1:10, group_order = c("A", "B"))

As you can observe, LEfSe analysis offers very significant results for many taxa even when the abundance is very similar between the 2 groups that I compare. In fact, the comparison between the 2 groups with conventional tests (t.test and wilcox) does not offer significant results in any of the taxa.

My question is, why does LEfSe offer such significant results if the differences between the taxa are very small or null?

Thank you very much for the help

ChiLiubio commented 2 years ago

Hi @bfalco You are correct! Thank you very much for this finding. This is a bug in the wilcox test path when there are two groups. Sorry for that. The original version of lefse in this class ran all the differential tests with the Kruskal-Wallis Rank Sum Test. Later I added the wilcox test to lefse for the two-group case in place of the KW test (the results are similar). But I forgot that the two functions are called slightly differently, which led to the weird result you found. I have updated the package on GitHub, and you can reinstall it from there. However, with the fix, none of the results are significant after p value adjustment, so you may want to consider other methods. The new version will be released on CRAN in the coming days. Please feel free to tell me if you have other suggestions. Thanks again for your careful comparison.

Best Chi
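
For readers following the KW versus wilcox point above, here is a minimal base R sketch on simulated data (the data frame `abund` and its values are hypothetical, and this is not the microeco implementation). It illustrates that for two groups the Kruskal-Wallis test agrees with the Wilcoxon rank sum test when the latter uses the normal approximation without continuity correction, while the default `wilcox.test` call can give a slightly different p value.

```r
# Simulated abundances of one taxon in two groups (hypothetical data).
set.seed(123)
abund <- data.frame(
  Group = factor(rep(c("A", "B"), each = 12)),
  value = c(rlnorm(12, meanlog = 0), rlnorm(12, meanlog = 0.2))
)

# Kruskal-Wallis rank sum test (valid for two or more groups).
p_kw <- kruskal.test(value ~ Group, data = abund)$p.value

# Wilcoxon rank sum test, normal approximation, no continuity correction.
# For exactly two groups this matches Kruskal-Wallis.
p_wilcox_approx <- wilcox.test(value ~ Group, data = abund,
                               exact = FALSE, correct = FALSE)$p.value

# Default call: uses an exact p value for small samples without ties,
# so it may differ slightly from the two values above.
p_wilcox_default <- wilcox.test(value ~ Group, data = abund)$p.value

print(c(kw = p_kw,
        wilcox_approx = p_wilcox_approx,
        wilcox_default = p_wilcox_default))
```

The point of the comparison is only that switching between the two tests should change p values slightly at most, so a large discrepancy hints at a usage error rather than a real statistical difference.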

bfalco commented 2 years ago

Chi, thank you very much for your help. I have checked the results with the update and there are no significant values.

Normally, I check the statistical tests with different R packages or software to make sure everything is correct. From the beginning, I have uploaded my data to Galaxy Server to compare the two groups (A and B) and I have obtained different results. Neither before nor now do the microeco and Galaxy Server outputs match for the LEfSe.

If there are no taxa that differ when comparing their abundances by group (p < .05) with microeco, why does the Galaxy Server indicate that there are significant taxa?

I send you the file so you can check it:

LEfSe.txt

Again, I thank you very much for your dedication.

ChiLiubio commented 2 years ago

Hi @bfalco Since I cannot open the Galaxy server right now, I can only guess from looking over your LEfSe.txt, but I see at least two points that could explain the difference. First, the function in microeco filters out all taxa with unidentified information, i.e. taxa whose names end with "__", such as "something|g__". Those features contribute nothing to meaningful results. Second, the function automatically applies a p value adjustment. As far as I remember, there is no p value adjustment in the LEfSe Python version (the one Galaxy uses). You can see that 33 taxa are significant in your data, but their p values are close to the threshold, so after p value adjustment no taxa remain significant. This is a normal phenomenon in statistics. So I am considering adding a parameter that lets users turn off the p adjustment. What do you think? Thanks for your good question.

Best Chi
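
The effect of p value adjustment described above can be sketched in base R. The p values below are made up for illustration (33 just under 0.05 among 300 tests), and the Benjamini-Hochberg method is an assumption chosen for the example, not a statement of which method microeco applies by default.

```r
# Hypothetical raw p values: 33 features just under 0.05, 267 above it.
p_raw <- c(seq(0.02, 0.049, length.out = 33),
           seq(0.06, 1, length.out = 267))

# Benjamini-Hochberg adjustment versus no adjustment.
p_bh   <- p.adjust(p_raw, method = "BH")
p_none <- p.adjust(p_raw, method = "none")  # returns p_raw unchanged

n_raw_sig <- sum(p_none < 0.05)  # 33 features pass without adjustment
n_bh_sig  <- sum(p_bh < 0.05)    # none pass after adjustment
print(c(unadjusted = n_raw_sig, BH = n_bh_sig))
```

With many tests and p values that sit close to the threshold, the adjusted values move well above 0.05, which is the pattern reported for this data set.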

bfalco commented 2 years ago

Yes, the filtering of taxa without taxonomic information (e.g., …|g__) and the option to choose the p-value adjustment method in microeco are wonderful, and they are advantages that, if I am not mistaken, Galaxy does not offer. In fact, it is already possible to get the results with no adjustment (p_adjust_method = 'none') from microeco. I am not sure whether your suggestion to add a parameter to turn off the p-value adjustment meant something different from this.

In any case, even if I remove the taxa with missing information from the file and look at the significant taxa without p-value adjustment in microeco, the results obtained with Galaxy do not match those obtained with microeco.

If you cannot verify the results from Galaxy with my data but you have other data, either real or fictitious, that offer the same results with both microeco and Galaxy, I would appreciate it if you could show them to me to check if I am running the test with different parameters.

Thank you so much for everything.

ChiLiubio commented 2 years ago

Hi @bfalco I successfully ran the Galaxy server lefse with your data. I also updated microeco on GitHub so that the parameter p_adjust_method = NULL is handled explicitly (no adjustment). For the comparison, I generated a new input file for Galaxy by deleting all the taxa whose names end with "__", "uncultured" or "sp", to make the methods as comparable as possible. All files are tab-separated.

The input file for Galaxy is: lefse_test_clean.txt
The Galaxy result is: Galaxy_lefse_result.txt

I filtered out the taxa with no significance (keeping p < 0.05). That leaves 30 taxa that are significant with LDA > 2 (default). Then I performed lefse in microeco with:

t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group", taxa_level = "all", alpha = 0.05, p_adjust_method = NULL)

The result is: microeco_lefse_result_wilcox.txt

There are some differences in p values between Galaxy and microeco, because the default differential test for two groups in microeco is wilcox. It was KW in the original version; I moved to wilcox in later versions. Then I updated microeco to use KW, the same differential test as Galaxy, and reran lefse in microeco:

t2 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group", taxa_level = "all", alpha = 0.05, p_adjust_method = NULL)

The result is: microeco_lefse_result_kruskal.txt

Now the test results are the same. The LDA scores differ a little between the two tools, but the trend is very similar. I think such a minor difference carries little weight and can be ignored, because other operations can have a stronger influence than this. I plan to go back to KW so the two tools are comparable and to reduce confusion for others. What do you think? Thanks very much for your suggestion.

Best Chi
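
The clean-up step described above (dropping taxa whose names end with "__", "uncultured" or "sp") could look roughly like this in base R. The lineage strings are hypothetical examples and the regular expression is an assumption based on the description in the comment, not the code used to produce lefse_test_clean.txt.

```r
# Hypothetical lineage strings in LEfSe style.
taxa <- c(
  "k__Bacteria|p__Firmicutes",
  "k__Bacteria|p__Firmicutes|c__Bacilli|o__|f__|g__",
  "k__Bacteria|p__Proteobacteria|g__Escherichia",
  "k__Bacteria|p__Bacteroidetes|g__uncultured",
  "k__Bacteria|p__Firmicutes|g__Bacillus|s__Bacillus_sp"
)

# Flag names that end with "__", "uncultured" or "sp".
# Note: a bare "sp$" is crude and would also hit any genuine name ending in "sp".
unresolved <- grepl("(__|uncultured|sp)$", taxa)

# Keep only the taxa with usable taxonomic information.
taxa_clean <- taxa[!unresolved]
print(taxa_clean)
```

Applying the same filter to the input of both tools removes one source of mismatch before the p values are compared.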

ChiLiubio commented 2 years ago

The v0.9.0 has been released on CRAN. The bug has been fixed. Thanks very much for your finding and suggestion!

bfalco commented 2 years ago

Hi Chi,

First of all, sorry for not replying earlier.

I have gone through all the files you sent me to verify that the same results can be obtained using microeco and Galaxy and, although the values are not identical, I am satisfied with the similarity and with having seen the process.

Knowing that many researchers have used Galaxy to apply LEfSe, my recommendation is that your microeco package should offer the same outputs as far as possible. With that in mind, before we say goodbye, I would like to know why the LEfSe outputs from microeco can vary if a seed is not set.

Thanks for the updates ;-)

ChiLiubio commented 2 years ago

Thanks. I will check how this difference is generated. This will be tricky and slow work.

bfalco commented 2 years ago

Okay, I get it. I guess the randomness that occurs when applying LEfSe, and that is seen in the LDA scores, also explains why the MANOVA test gives slightly different p-values each time it is run.

ChiLiubio commented 2 years ago

Excellent! This is a key point.
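
As a rough illustration of why a seed matters, the sketch below uses a simplified resampling statistic in base R. It is a stand-in, not the LEfSe LDA step: the function `boot_effect`, the 30 iterations and the 2/3 subsample fraction are all illustrative assumptions. It only shows that any estimate built on random resampling is reproducible with `set.seed()` and varies without it.

```r
# Hypothetical effect-size estimate averaged over random resamples.
boot_effect <- function(x, y, n_boot = 30, frac = 2/3) {
  diffs <- vapply(seq_len(n_boot), function(i) {
    xs <- sample(x, size = ceiling(frac * length(x)), replace = TRUE)
    ys <- sample(y, size = ceiling(frac * length(y)), replace = TRUE)
    mean(xs) - mean(ys)
  }, numeric(1))
  mean(diffs)
}

# Fixed example data (simulated once).
set.seed(1)
x <- rnorm(15, mean = 1)
y <- rnorm(15, mean = 0)

set.seed(42); r1 <- boot_effect(x, y)  # seeded run
set.seed(42); r2 <- boot_effect(x, y)  # same seed: identical result
r3 <- boot_effect(x, y)                # no reseed: different resamples

print(c(r1 = r1, r2 = r2, r3 = r3))
```

Calling `set.seed()` immediately before the analysis, as in the first post of this thread, is what makes such resampling-based scores repeatable between runs.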

ChiLiubio / microeco

Interpretation of the results applying LEfSe #106