Buraah closed this 1 month ago
Hello @SvetlanaUP , can I work on this instead? Thank you.
Thank you @SvetlanaUP
Hello @SvetlanaUP, This curation is ready for review.
Thank you.
@Buraah could you please review this curation done by @Scholarpat? Thanks!
Hi @Scholarpat, thank you for curating this study.
This study has several experiments, but only one has differential abundance signatures, which you have curated. In my view, it is a well-done curation.
My only concern is with the data transformation. The study identified differential taxa using DESeq2, and I know this test usually takes raw counts as input, but the study doesn't state that anywhere. However, there are several mentions of relative abundance, e.g.: "Before diversity comparisons, the operational taxonomic unit (OTU) counts were normalized by a total sum (% relative abundance) followed by square-root transformation." (first paragraph under the "Analysis of microbial communities" section)
So for data transformation, I'm tending more toward "Relative Abundance" but I stand to be corrected.
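For reference, here is a minimal sketch of what the quoted normalization does to a sample, assuming it means total-sum scaling to percentages followed by a square root. The function names and the example counts are made up for illustration and do not come from the paper.

```python
import math


def to_percent_relative_abundance(counts):
    """Total-sum scaling: convert raw OTU counts for one sample to percentages."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [100.0 * c / total for c in counts]


def sqrt_transform(values):
    """Square-root transform, applied after normalization."""
    return [math.sqrt(v) for v in values]


# Hypothetical raw counts for three OTUs in one sample.
otu_counts = [50, 30, 20]
percent = to_percent_relative_abundance(otu_counts)
transformed = sqrt_transform(percent)
```

After this step the values are no longer whole-number counts, which is why it matters whether the transformed table or the original counts went into the differential abundance test.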
Hello @Buraah,
Thank you so much for your review and feedback.
Regarding the data transformation, I initially opted for relative abundance. However, the OTU counts mentioned in the excerpt you quoted made me switch to raw counts. I am still somewhat uncertain about this decision though...
Great work @Scholarpat and @Buraah!
I remember we discussed this; here it is: https://community-bioc.slack.com/archives/C04RATV9VCY/p1697145085673689?thread_ts=1697141457.022309&cid=C04RATV9VCY
Chloe nicely explained: Data transformations are often dependent on the statistical test. This can be difficult to figure out, so I recommend asking questions if you're not sure, but generally speaking:
- Raw counts -> Poisson, negative binomial, linear models, DESeq2
- Relative abundances -> This is most common. Mann-Whitney U, Kruskal-Wallis, LEfSe, many others
- Centered log ratio -> Rare. ANCOM
- Arcsine square-root -> Rare. MaAsLin2 sometimes. Some linear models rarely.
I will note that this is wrong or missing for many previously curated papers, and a good cleanup task for an intrepid soul would be to try to update all of these. I've found a lot of DESeq2 papers that say they use relative abundances or CLR, which does not make sense: **DESeq2 uses a negative binomial model which requires counts** (or other whole-number variables that approximate a negative binomial/Poisson distribution), or else it will not converge.
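For anyone picking up the cleanup, a rough sketch of how the rule of thumb above could be used to pre-screen curated records. The record fields (`study`, `statistical_test`, `data_transformation`), the example records, and the function name are hypothetical and not an actual BugSigDB export format. MaAsLin2 is left out of the lookup because the note above only says "sometimes" for it. Anything flagged still needs a manual check against the paper.

```python
# Rule-of-thumb lookup taken from the explanation above:
# statistical test -> data transformations that plausibly go with it.
EXPECTED = {
    "deseq2": {"raw counts"},
    "poisson": {"raw counts"},
    "negative binomial": {"raw counts"},
    "linear model": {"raw counts", "arcsine square-root"},
    "mann-whitney u": {"relative abundances"},
    "kruskal-wallis": {"relative abundances"},
    "lefse": {"relative abundances"},
    "ancom": {"centered log ratio"},
}


def _norm(value):
    """Lowercase and trim a field; treat None as empty."""
    return (value or "").strip().lower()


def flag_for_review(records):
    """Return (study, reason) pairs for records worth a manual look."""
    flagged = []
    for rec in records:
        test = _norm(rec.get("statistical_test"))
        transformation = _norm(rec.get("data_transformation"))
        if not transformation:
            flagged.append((rec["study"], "missing data transformation"))
            continue
        expected = EXPECTED.get(test)
        # Tests not in the lookup are skipped rather than guessed at.
        if expected is not None and transformation not in expected:
            flagged.append((rec["study"], f"{test} with {transformation}"))
    return flagged


# Hypothetical records for illustration only.
records = [
    {"study": "Study A", "statistical_test": "DESeq2",
     "data_transformation": "relative abundances"},
    {"study": "Study B", "statistical_test": "LEfSe",
     "data_transformation": "relative abundances"},
    {"study": "Study C", "statistical_test": "Kruskal-Wallis",
     "data_transformation": ""},
    {"study": "Study D", "statistical_test": "DESeq2",
     "data_transformation": "raw counts"},
]
to_review = flag_for_review(records)
```

With these made-up records, Study A (DESeq2 recorded with relative abundances) and Study C (no transformation entered) would be flagged, while B and D pass.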
https://bugsigdb.org/Study_1082 reviewed.
Thank you @SvetlanaUP . I've noted this for future curations.
Well done @Scholarpat! Thank you @SvetlanaUP. I think the cleanup will be a great task to take up too. I'll consider doing this after the curation I'm currently working on.
Hi @SvetlanaUP, good morning. While I wait for the author's response on Study 1083, I want to start the cleanup task we talked about here.
I also noticed that some studies curated earlier have no entry for data transformation, and I would like to fill those in as I go. For example, these two: Study 2 and Study 3
Thank you.
@Buraah please do, THANKS!
The interplay between PCOS pathology and diet on gut microbiota in a mouse model
Link: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9450977/