Open addramir opened 2 months ago
@DSuveges loves multiple testing problems...
@Daniel-Considine what is the threshold for the eQTL Catalogue?
I have no problem with lowering the p-value threshold; however, setting one single value across the entire data lake sounds quite drastic, and the consequences are hard to imagine, especially given the curated datasets from GWAS Catalog. I would carefully benchmark the effect. I can imagine there would be entire diseases where we would lose all the knowledge we had.
Let's assume there's a disease with only one GWAS study that could identify 3 significant loci given the 5e-8 threshold. This knowledge comes with the standard pinch of salt: a 5% chance that these signals are false. (If I interpret the stats correctly here.)
After the p-value threshold adjustment, we can claim that the overall reliability of our full dataset improved, but we would no longer be able to tell anything about this disease, not even with the 5% FDR. Is it worth it?
I tend to sympathise with this action for traits/diseases where there are a bunch of studies with hugely varying sample sizes, but such a systemic cut might drop a lot of rare stuff.
Let's assume there's a disease with only one GWAS study that could identify 3 significant loci given the 5e-8 threshold. This knowledge comes with the standard pinch of salt: a 5% chance that these signals are false. (If I interpret the stats correctly here.)
The math holds if you have only one study. If you have more studies at the same p-value threshold, your FDR will be much higher.
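To make the point above concrete, here is a rough sketch of how the chance of at least one false positive grows with the number of studies at a fixed threshold. Note that it computes the family-wise error rate rather than the FDR, and it rests on assumptions that are not from this thread: roughly 1e6 independent tests per study, independence across studies, and every test being a true null. The function name `familywise_error_rate` is made up for illustration.

```python
import math


def familywise_error_rate(alpha: float, tests_per_study: float, n_studies: int = 1) -> float:
    """Probability of at least one false positive across all tests.

    Assumes every test is an independent true null (a simplification).
    """
    total_tests = tests_per_study * n_studies
    # 1 - (1 - alpha)^N, written with log1p/expm1 so tiny alphas stay accurate.
    return -math.expm1(total_tests * math.log1p(-alpha))


# Assumed: ~1e6 independent tests per study, threshold 5e-8.
for n_studies in (1, 10, 100):
    rate = familywise_error_rate(5e-8, 1e6, n_studies)
    print(f"{n_studies:>3} studies: {rate:.3f}")
# Prints approximately 0.049, 0.393 and 0.993.
```

Under these assumptions a single study sits near the familiar 5%, while a hundred studies at the same threshold are almost certain to contain at least one false positive.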
The proposed threshold doesn't really solve the problem of multiple testing (it is too liberal), but it can probably fix some problems that come from high-density genotyping panels and non-European ancestries. I agree that we need to benchmark it somehow, and I will create a separate ticket for it.
It is time to discuss the genome-wide significance p-value threshold for GWAS and molQTL studies. We started this discussion with @Daniel-Considine and @d0choa.
Background
For now, we use the standard 5e-8 threshold for clumping and FM, but at the harmonic averaging stage we only use 1e-8. We are growing, and we also have non-European ancestries. 5e-8 no longer works. It is difficult to estimate the effective number of independent tests. As a rule of thumb, however, we can simply assume that the new threshold should be at least one order of magnitude stricter than the previous one. If it was 1e-8, the new one should be 1e-9. Using the new threshold will also reduce the computational effort. @DSuveges what do you think?
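For reference, the rule of thumb above can be read through a plain Bonferroni correction: the threshold is the target family-wise alpha divided by the effective number of independent tests, so a tenfold increase in effective tests makes the threshold one order of magnitude stricter. This is only a sketch; the effective test counts below are illustrative assumptions, not estimates for our data, and `bonferroni_threshold` is a made-up helper name.

```python
def bonferroni_threshold(alpha: float, n_effective_tests: float) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate near alpha."""
    if n_effective_tests <= 0:
        raise ValueError("n_effective_tests must be positive")
    return alpha / n_effective_tests


# Assumed effective numbers of independent tests (illustrative only).
for n_tests in (1e6, 1e7):
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{n_tests:.0e} tests -> threshold {threshold:.1e}")
# 1e6 tests gives roughly 5e-8; 1e7 tests gives roughly 5e-9.
```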
For molQTLs I suggest using the default study-wise p-value threshold. For example, UKBB-PPP uses 1.7e-11.
Tasks
Acceptance tests
It would be nice to check how many fewer CSs we will have.
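A minimal sketch of that check, assuming each CS can be summarised by the p-value of its lead variant. The p-values below are made up for illustration, and `count_credible_sets` is a hypothetical helper, not an existing function in our pipeline.

```python
def count_credible_sets(lead_pvalues, thresholds):
    """Count, per threshold, how many lead p-values are at or below it."""
    return {t: sum(p <= t for p in lead_pvalues) for t in thresholds}


# Made-up lead-variant p-values, one per credible set.
lead_pvalues = [3e-8, 8e-9, 2e-10, 4.9e-8, 1e-12, 6e-9]

counts = count_credible_sets(lead_pvalues, [5e-8, 1e-8, 1e-9])
for threshold, n_sets in counts.items():
    print(f"p <= {threshold:.0e}: {n_sets} CSs")
# With this toy input: 6 at 5e-8, 4 at 1e-8, 2 at 1e-9.
```

Running the same count per study or per disease would also show which traits lose all of their signals, which is the concern raised in the comments.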