martinjzhang / scDRS

Single-cell disease relevance score (scDRS)
https://martinjzhang.github.io/scDRS/
MIT License

The 'n_fdr_0.2' number of all group is 0 #84

Closed Y-Isaac closed 3 months ago

Y-Isaac commented 3 months ago

Hi, this is a nice tool, thanks for your effort!

I ran a downstream analysis on multiple traits at once and found that n_fdr_0.2 is 0 in every group for some phenotypes, which left me somewhat puzzled. I also found that even where these all-zero phenotypes are significant, no squares are drawn for them in the subsequent plots. Is this situation normal, and how should I interpret it? The log is below:

Trait=Trait1, n_gene=810: 0/110824 FDR<0.1 cells, 0/110824 FDR<0.2 cells (sys_time=6478.5s)

Thank you in advance for your help! I am new to this field, so please forgive me if my question is a bit foolish!

Best regards, Issac

Y-Isaac commented 3 months ago

Oh, I see that a similar question has been discussed before, but I have some extra questions I want to confirm:

1. As mentioned earlier, phenotypes that are all zeros are blank when plotted, even if they are significant in certain cell groups. Is this something you intentionally designed? According to the documentation, a cell group that is significant should be marked with a square.

2. According to #64, this situation may be due to a lack of power in scDRS. What I don't quite understand is that I ran seven phenotypes at once and only one of them showed this behavior, even though these phenotypes are related, so this result is a bit strange to me.

Thanks in advance!

KangchengHou commented 3 months ago
  1. It looks like Trait1 does not have any significant cells. Can you clarify what you mean that "they are significant on certain cell groups"?

  2. The power of scDRS depends on factors such as the sample size of the GWAS. Perhaps some of the phenotypes you ran have a lower sample size or lower heritability, causing them to lack power.

Y-Isaac commented 3 months ago
Thank you very much for your assistance. I'll do my best to understand your meaning.

1. Yes, trait 1 does not exhibit significant cells in any cell type. However, at the same time, it is significant in some cell types (i.e., assoc_mcp < 0.05).

2. In my experience, this is unlikely to be due to insufficient GWAS sample size: I ran the scDRS analysis for all phenotypes simultaneously, and these seven phenotypes have similar sample sizes and heritabilities, but only trait 1 exhibits this phenomenon.

3. I also attempted the analysis using human single-cell data (TabulaSapiens), and the situation improved; n_fdr_0.05 appears to be normal.

4. I would like to confirm: in cases where assoc_mcp < 0.05 but n_fdr_0.05 is 0, can I consider this cell type significant for this phenotype?

I'm eager to receive your further assistance!

KangchengHou commented 3 months ago

Thanks for the clarification.

> 1. Yes, trait 1 does not exhibit significant cells across all cell types. However, at the same time, it is significant in some cell types (i.e., assoc_mcp < 0.05).

assoc_mcp is the nominal p-value, whereas the plotting function assigns significance to a cell type only after multiple testing correction.

> 2. Based on my practice, this is unlikely due to insufficient sample size in GWAS, as I conducted scDRS analysis for all phenotypes simultaneously, and these seven phenotypes have similar sample sizes and heritabilities, but only trait 1 exhibits this phenomenon.

Other factors also influence the number of significant cells, e.g., the cells in your dataset may not be enriched for the corresponding GWAS trait gene set.

> 4. I would like to confirm if, in cases where assoc_mcp < 0.05 but n_fdr_0.05 is 0, can I consider this cell type significant in this phenotype?

You can use assoc_mcp (a p-value) or assoc_mcz (a z-score), but you do need to correct for multiple testing across cell types and traits. For more flexibility, you can take the numbers from the output dataframe and perform the multiple testing correction yourself rather than use our plotting function.
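A minimal sketch of that manual correction, assuming the Benjamini-Hochberg procedure and pooling all trait and cell-type pairs into one family of tests. The helper name `benjamini_hochberg` and the rows below are hypothetical and invented for illustration; only the column name assoc_mcp comes from this thread.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Hypothetical (trait, cell type, nominal assoc_mcp) rows; values are made up.
rows = [
    ("Trait1", "T cell", 0.045),
    ("Trait1", "B cell", 0.400),
    ("Trait2", "T cell", 0.001),
    ("Trait2", "B cell", 0.002),
]

# Correct across every trait and cell-type pair at once.
qvals = benjamini_hochberg([p for _, _, p in rows])
significant = [(trait, cell) for (trait, cell, _), q in zip(rows, qvals) if q < 0.05]

for (trait, cell, p), q in zip(rows, qvals):
    print(f"{trait}\t{cell}\tassoc_mcp={p:.3f}\tFDR={q:.3f}")
```

In this toy input, the Trait1 T cell pair has a nominal p-value below 0.05 but an adjusted value above it, which is the pattern described above. With real output, the same correction could likely be done with `statsmodels.stats.multitest.multipletests(pvals, method="fdr_bh")` instead of a hand-written helper.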

Y-Isaac commented 3 months ago

Thanks for your help! I think I understand what you mean. I would like your confirmation: if I input ten phenotypes at once, each containing ten cell types, will the p-values for all 100 cell type-phenotype pairs be corrected together using the Benjamini-Hochberg FDR procedure, and will the plot then be drawn from the corrected q-values?

KangchengHou commented 3 months ago

assoc_mcp and assoc_mcz in the produced dataframe are at the nominal level (so you need to convert them to FDR yourself).

But the plotting function takes the cell-type and phenotype pairs (all 100 pairs in your example), performs the FDR correction across them, and determines significance based on FDR < 0.05.
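To illustrate why a nominal assoc_mcp < 0.05 can still leave a cell type without a square, here is the Benjamini-Hochberg rule worked through for the 100-pair example. This assumes Benjamini-Hochberg is the FDR procedure being applied, as suggested in the question; the thread itself only says "FDR correction".

```latex
% Sort the m nominal p-values and find the largest rank under the BH line.
p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}, \qquad
k^{*} = \max\left\{ k : p_{(k)} \le \frac{k}{m}\,\alpha \right\}

% With m = 100 pairs and \alpha = 0.05, the condition at rank k becomes
p_{(k)} \le \frac{k}{100} \times 0.05 = 0.0005\,k
```

The pairs ranked 1 through k* are called significant. Under this rule, a single pair standing out on its own needs a nominal p-value of at most 0.0005, and a pair with a nominal p-value of 0.04 is called significant only if at least 80 of the 100 pairs are.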

Y-Isaac commented 3 months ago

I really appreciate your help! I am going to close this issue now.