yijuanhu / LOCOM-Archive

A logistic regression model for testing differential abundance in compositional microbiome data

Interpreting the summary table output by LOCOM #6

Open brandon-kieft opened 1 year ago

brandon-kieft commented 1 year ago

Hello,

I was able to get LOCOM working on my 16S SSU count table and have created a "res" object based on one of my binary metadata variables:

res <- locom(otu.table = data.matrix(locom_counts_data), Y = as.numeric(factor(locom_metadata$Variable)), seed = 555, adjustment = "BH", n.cores = 1, prev.cut = 0, n.perm.max = 5000, Firth.thresh = 1)

My question is how to interpret the effect sizes in the results table (see attachment). Which level of my binary variable (e.g., Control vs. Treatment) does a positive or negative effect size refer to? In other words, how do I know which OTUs are associated with Control and which with Treatment?

locom_summary_table.txt

Thanks! Brandon

yijuanhu commented 1 year ago

Hi Brandon,

The effect sizes are the coefficients for the case-control status in the logistic regression of the otu.table on the case-control status. So, a positive effect size means higher abundance of the taxon in cases (if cases are coded as 1s and controls are coded as 0s).

Best, Yijuan
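
To illustrate the coding point above, here is a minimal base-R sketch of how a call like as.numeric(factor(...)) codes a two-level variable. The group labels and the variable names (variable, y_default, y01) are assumptions for illustration, and the sketch does not call LOCOM itself.

```r
# Hypothetical metadata column; the group labels are made up for illustration.
variable <- c("Control", "Treatment", "Treatment", "Control")

# factor() sorts levels alphabetically by default, so as.numeric() yields
# Control = 1 and Treatment = 2 (not 0/1).
f <- factor(variable)
levels(f)
y_default <- as.numeric(f)

# An explicit 0/1 coding makes the reference group unambiguous:
# 0 = Control (reference), 1 = Treatment.
y01 <- as.numeric(variable == "Treatment")

# Side-by-side view of the two codings.
data.frame(variable, y_default, y01)
```

Both codings differ by exactly one unit between the groups, so under either one a positive coefficient would be expected to point to the higher-coded level ("Treatment" here). Coding the variable explicitly as 0/1 simply removes the need to remember the level order.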

xnysound commented 1 year ago

Hi Yijuan,

I am new to LOCOM and have a few questions:

  1. I ran the following code: res <- locom(otu.table = filtered_otu, Y = Y, C = C[, 1], fdr.nominal = 0.2, seed = 1, adjustment = "BH", n.cores = 4, prev.cut = 0, n.perm.max = 10000, Firth.thresh = 0)
  • The output showed "permutations: 1", "permutations: 1001", "permutations: 2001". Does this mean the permutation stopped at 2001 due to exceeding the minimum number of rejections?
  • What does "14 OTU(s) with fewer than 65.8 in all samples are removed" mean after running this?
  • I am trying to use LOCOM to validate species with differential abundance after using ANCOM-BC. I have used the same filtered phyloseq object for both methods. To keep the approach consistent, would you suggest setting "Firth.thresh = 0"?
  2. After running the following code, I saw that otu.name showed the full name of the taxa and otu.tax showed NA. Considering I didn't use the OTU number as the otu.name, I think this is okay, right?

summary.table <- data.frame(otu.name = colnames(res$p.otu)[o], mean.freq = colMeans(otu.table.filter/rowSums(otu.table.filter))[o], prop.presence = prop.presence[o], p.value = signif(res$p.otu[o], 3), q.value = signif(res$q.otu[o], 3), effect.size = signif(res$effect.size[o], 3), otu.tax = throat.otu.taxonomy[as.numeric(colnames(res$p.otu)[o]) + 1], row.names = NULL)

Many thanks!

Best, Xinyi

YINGTIAN-HU commented 1 year ago

Hi @xnysound,

Thanks for your questions.

For your first question, it means that 14 OTUs with less than 20% presence are filtered out (the number is 65.8 because I guess your sample size is 329, and 0.2 × 329 = 65.8). If you do not want this additional filtering, you may set prev.cut = 0. As for Firth.thresh, I would suggest using the default of 0.4 to help avoid possible non-convergence issues.

For your second question, it should be okay. You can also use OTU numbers as the OTU names, or specify the OTU names yourself at the beginning of the analysis, to see whether this issue disappears.

Thanks, Yingtian
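
The threshold arithmetic in the reply above can be checked with a small base-R sketch. It assumes the sample size of 329 guessed in the reply, and it assumes the filter counts the number of samples in which an OTU is present; the toy table and the names (otu, n_present, keep) are hypothetical, and this is not LOCOM's internal code.

```r
# Threshold arithmetic from the reply: 20% of an assumed 329 samples.
n_samples <- 329
prev_cut  <- 0.2
threshold <- prev_cut * n_samples

# Toy count table (10 samples in rows, 4 OTUs in columns); values are made up.
otu <- cbind(otu1 = c(5, 3, 0, 7, 2, 1, 0, 4, 6, 2),
             otu2 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
             otu3 = c(0, 1, 4, 2, 0, 0, 3, 0, 1, 0),
             otu4 = rep(0, 10))

# Number of samples in which each OTU is present (count > 0).
n_present <- colSums(otu > 0)

# Keep OTUs present in at least prev_cut * (number of samples) samples.
keep <- n_present >= prev_cut * nrow(otu)
otu_filtered <- otu[, keep, drop = FALSE]
colnames(otu_filtered)
```

In this toy table, the OTU seen in a single sample and the OTU never seen fall below the 20% cutoff, while the other two are kept.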

shibataryohei commented 1 year ago

Hi Yijuan,

Let me ask about the details of "effect.size".

In a conventional logistic regression model with a continuous variable as the "exposure", the exponentiated coefficient is the odds ratio for every 1-unit increase in the exposure. For example, when the odds ratio is 1.4, we can interpret the model as saying that a 1 mg/L increase in the dose of a drug is associated with 1.4-fold higher odds of the outcome.

Then, in LOCOM, what does "effect.size" represent?

Thanks, Ryohei

yijuanhu commented 1 year ago

In LOCOM, the trait of interest is treated as the exposure, and whether a microbiome read falls into the taxon of interest is treated as the binary outcome. exp(effect size) is the odds ratio of a read falling into the taxon of interest rather than into a null taxon when the exposure increases by 1 unit.

Hope this is helpful. I know this sounds a little awkward.

Yijuan
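
As a small worked example of the exp(effect size) interpretation above, the following base-R sketch converts effect sizes to odds ratios. The taxon names and values are made up for illustration and are not output from LOCOM.

```r
# Hypothetical effect sizes (log odds ratios); values are made up.
effect_size <- c(taxonA = 0.5, taxonB = 0, taxonC = -1.2)

# Odds ratio per one-unit increase in the exposure.
odds_ratio <- exp(effect_size)
round(odds_ratio, 3)

# For a k-unit increase in the exposure, the odds ratio is exp(k * effect size).
or_two_units <- exp(2 * effect_size["taxonA"])
or_two_units
```

An effect size of 0 corresponds to an odds ratio of 1 (no association), positive values to odds ratios above 1, and negative values to odds ratios below 1.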

xnysound commented 1 year ago

Hi Yijuan, Could you please clarify the difference between mean frequency and proportion presence?

Thank you!

Best, Xinyi

yijuanhu commented 1 year ago

Mean frequency is the mean relative abundance of a taxon across samples. Proportion of presence is the expected value of presence (read count > 0), adjusted for library size.

Yijuan
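
A minimal base-R sketch of the first quantity, using the same colMeans(otu / rowSums(otu)) pattern as the summary-table code earlier in this thread. The toy table is made up, and raw_presence is only the unadjusted proportion of samples with a nonzero count, shown for contrast; it is not the library-size-adjusted value that LOCOM reports.

```r
# Toy count table (samples in rows, taxa in columns); values are made up.
otu <- rbind(s1 = c(taxonA = 90,  taxonB = 10, taxonC = 0),
             s2 = c(taxonA = 40,  taxonB = 0,  taxonC = 10),
             s3 = c(taxonA = 150, taxonB = 50, taxonC = 0))

# Relative abundance: each count divided by its sample's library size.
rel_abund <- otu / rowSums(otu)

# Mean frequency: mean relative abundance of each taxon across samples.
mean_freq <- colMeans(rel_abund)

# Raw (unadjusted) proportion of samples in which each taxon is present.
raw_presence <- colMeans(otu > 0)

round(rbind(mean_freq, raw_presence), 3)
```

The two rows can differ a lot: a taxon can be present in most samples yet have a low mean relative abundance, which is why the summary table reports both.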

shibataryohei commented 1 year ago

Hi Yijuan,

I see. So, unlike a conventional logistic regression model, LOCOM does not treat microbiome abundance as the exposure and a binary variable (e.g., the occurrence of the trait of interest) as the outcome!

Thanks, Ryohei

xnysound commented 1 year ago

Thank you, Yijuan. Could you please clarify the numerator and denominator of proportion presence?

yijuanhu commented 1 year ago

Hi,

We have now updated the term to be “probability of presence”, which is the expected presence of a taxon in a sample if all library sizes are rarefied to the same value. It is “expected” because we don’t perform any actual rarefaction of the library size but calculate the expected (average) probability of presence over all possible rarefactions.

Hope this is clear.

Yijuan
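
One way to picture the "expected presence over all possible rarefactions" described above is the hypergeometric probability that at least one read of the taxon survives when a sample is subsampled without replacement to a common depth. The sketch below is a base-R illustration under that assumption. The function name expected_presence, the toy counts, and the choice of the smallest library size as the depth are all hypothetical and are not taken from LOCOM's source code.

```r
# Probability that a taxon with x reads in a library of N reads is still
# present after rarefying (sampling without replacement) to depth d.
# Assumption: hypergeometric sampling; not LOCOM's verified implementation.
expected_presence <- function(x, N, d) {
  1 - dhyper(0, m = x, n = N - x, k = d)
}

# Toy data: counts of one taxon and library sizes in three samples (made up).
x <- c(3, 0, 20)
N <- c(1000, 500, 5000)
d <- min(N)  # assumed common rarefaction depth

# Per-sample probability of presence, then the average over samples.
p <- expected_presence(x, N, d)
p
mean(p)
```

No reads are actually discarded here: the probability is computed analytically for each sample, which matches the idea of averaging over rarefactions rather than performing one.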
