biocore / songbird

Vanilla regression methods for microbiome differential abundance analysis
BSD 3-Clause "New" or "Revised" License

About filtering the input biom file #145

Open coralzhang opened 3 years ago

coralzhang commented 3 years ago

Hi, I have a question about the input biom file. For interpreting the results, does it make more sense to input an OTU table that includes all the taxa, or only the taxa at a single level, say only the family level or only the genus level? Put another way: if I input a biom file with or without the taxonomy of the OTUs, does it make a difference? Will songbird look at the taxonomy? Many thanks!

mortonjt commented 3 years ago

No, songbird will not look at the taxonomy. If you want to collapse taxa, you'll want to do it yourself with the qiime taxa collapse command. https://docs.qiime2.org/2020.8/plugins/available/taxa/collapse/
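To illustrate what collapsing means here, the following is a minimal sketch in plain Python, not the implementation behind qiime taxa collapse. It assumes a made-up table of per-sample counts and made-up taxonomy strings; the names `counts`, `taxonomy`, and `collapse` are hypothetical. Features that share the same taxonomy prefix have their counts summed, so the resulting table has one row per taxon at the chosen level.

```python
# Hypothetical toy data: feature IDs mapped to taxonomy strings
# and to per-sample counts (three samples).
taxonomy = {
    "otu1": "f__Lachnospiraceae; g__Blautia",
    "otu2": "f__Lachnospiraceae; g__Blautia",
    "otu3": "f__Lachnospiraceae; g__Roseburia",
    "otu4": "f__Bacteroidaceae; g__Bacteroides",
}
counts = {
    "otu1": [10, 0, 3],
    "otu2": [5, 2, 1],
    "otu3": [0, 7, 4],
    "otu4": [20, 11, 9],
}


def collapse(counts, taxonomy, depth):
    """Sum the rows of all features sharing the first `depth` taxonomy ranks."""
    collapsed = {}
    for feature_id, row in counts.items():
        ranks = [rank.strip() for rank in taxonomy[feature_id].split(";")]
        label = ";".join(ranks[:depth])
        if label in collapsed:
            # Add this feature's counts to the running per-sample totals.
            collapsed[label] = [a + b for a, b in zip(collapsed[label], row)]
        else:
            collapsed[label] = list(row)
    return collapsed


family_table = collapse(counts, taxonomy, depth=1)  # one row per family
genus_table = collapse(counts, taxonomy, depth=2)   # one row per genus

for label, row in genus_table.items():
    print(label, row)
```

Either collapsed table could then be written back out as a biom file and passed to songbird, which only sees feature IDs and counts.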

coralzhang commented 3 years ago

So can I interpret your answer this way: if I were to use standalone songbird, you would recommend inputting a biom file with the OTU table at one selected level, say the genus level, because having taxa at different levels in the same table, say both the genus and family levels, technically violates the multinomial assumption? Is that correct?

mortonjt commented 3 years ago

I wouldn't know how to use taxa at different levels, so I can't comment on this, unless you are talking about having a custom script to aggregate taxa on the phylogenetic tree. If that is the case, that is totally fine, but songbird can't do any of that; you'd need to feed in preprocessed biom tables yourself.

coralzhang commented 3 years ago

Thanks for the response, and sorry about my ill-posed question. Suppose I have a biom file with an OTU table that contains counts for 2 families, and each family has 4 genera. Would you suggest that I a) input a biom file with only 2 rows of counts for the families (or only 8 rows of counts for the genera), or b) input a biom file with 10 rows of counts for the families and genera together?
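The two options can be made concrete with a small sketch for a single sample. The counts and names below are invented, and the sketch assumes each family row is simply the sum of its four genus rows. It is only arithmetic on a toy table, not a statement about what songbird requires.

```python
# Hypothetical counts for one sample: 2 families, 4 genera each.
genus_counts = {
    "famA": {"gA1": 10, "gA2": 5, "gA3": 0, "gA4": 5},
    "famB": {"gB1": 20, "gB2": 0, "gB3": 8, "gB4": 2},
}

# Option a): a table at a single level (2 family rows, or 8 genus rows).
family_rows = {fam: sum(genera.values()) for fam, genera in genus_counts.items()}
genus_rows = {g: n for genera in genus_counts.values() for g, n in genera.items()}

# Option b): families and genera stacked into one 10-row table.
mixed_rows = {**family_rows, **genus_rows}

total_reads = sum(genus_rows.values())
print("family table:", len(family_rows), "rows, total", sum(family_rows.values()))
print("genus table: ", len(genus_rows), "rows, total", sum(genus_rows.values()))
print("mixed table: ", len(mixed_rows), "rows, total", sum(mixed_rows.values()))
```

Under that assumption, the single-level tables both sum to the number of reads in the sample, while the stacked table counts every read twice, once in its genus row and once in its family row.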

mortonjt commented 3 years ago

I'm having trouble understanding this question - it would help if you provided a snapshot of the table you are trying to analyze.

Also, this doesn't seem to be a songbird question. I would recommend following up on the qiime2 forums, since the community there would be in a better position to answer this type of question.
