single-cell-genetics / limix_qtl

random effect file #38

Open perl-xxp opened 3 weeks ago

perl-xxp commented 3 weeks ago

Dear Marc Jan Bonder,

I am following your paper published in Genome Biology (2021) to perform cell-type specific sc-eQTL mapping. In the paper, you mentioned a model including two random effects (kinship and 1/ncells or 1/nreads), and using a weighting factor w. I'm a bit confused about how to implement this functionality using the limix_qtl software. Does it involve adding two files in the random effect argument?

Additionally, regarding the multiple testing correction process, could you provide more explanation on how to implement it using the limix_qtl software? I've seen some scripts in the software, but I'm still uncertain about the exact procedure.

Thank you for your time and assistance.

Best regards

Bonder-MJ commented 6 days ago

Hello,

Thanks for your question.

You can indeed input two random effect files when running. The first one is expected to be the genotype kinship (or something similar, using genotype IDs); the second one can be your random effect on the phenotype (using phenotype IDs). You can feed them in using the same flag, separating the files with a comma.

I hope that helps.

What would you like to know regarding multiple testing? You can convert the hdf5 files using the postprocessing scripts, and then apply any further multiple testing correction you would like using your favorite programming language.

Best, Marc

perl-xxp commented 19 hours ago

Thanks for your reply. So for the multiple testing results, is the "empirical_feature_p_value" column the first-round corrected P value (gene level, across SNPs)? I have tested one cell type using the BH method on the empirical_feature_p_value of "top_qtl_results_all.txt" and only got 30 eGenes (FDR < 0.05). Is it normal to have such a low number of eGenes? My population is approximately 120 individuals, ~1 million cells, and I used sex, age, height, weight, and 30 expression PCs as covariates, with kinship as the first random effect and 1/n cells as the second. For the software, the parameters "-c -gm gaussnorm -rc -lrc" were set. I am running other cell types to check the eGene number. I have checked my genotype and expression data several times and didn't find any major mistakes. Do you have any suggestions?

Thanks, Xiaopeng
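For reference, the gene-level BH step described in this comment can be sketched as below. This is a minimal illustration, not the limix_qtl implementation: it assumes "top_qtl_results_all.txt" is tab-delimited with one row per gene and has a `feature_id` column next to `empirical_feature_p_value` (the `feature_id` name and the delimiter are assumptions), and the helper names `load_top_qtls` and `call_egenes` are hypothetical.

```python
import csv


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def load_top_qtls(path):
    """Read the top-QTL table into a list of dicts (assumed tab-delimited)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def call_egenes(rows, alpha=0.05,
                p_column="empirical_feature_p_value",
                id_column="feature_id"):
    """Return the IDs of genes whose BH-adjusted p-value is below alpha."""
    pvalues = [float(row[p_column]) for row in rows]
    qvalues = benjamini_hochberg(pvalues)
    return [row[id_column] for row, q in zip(rows, qvalues) if q < alpha]


# Toy in-memory rows standing in for the real file (values are made up).
toy_top_rows = [
    {"feature_id": "geneA", "empirical_feature_p_value": "0.0001"},
    {"feature_id": "geneB", "empirical_feature_p_value": "0.2"},
    {"feature_id": "geneC", "empirical_feature_p_value": "0.03"},
    {"feature_id": "geneD", "empirical_feature_p_value": "0.8"},
]
egenes = call_egenes(toy_top_rows)
print(egenes)
```

On a real run, `call_egenes(load_top_qtls("top_qtl_results_all.txt"))` would replace the toy rows, provided the file matches the assumed layout.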

Bonder-MJ commented 10 hours ago

Hi Xiaopeng,

No, that sounds like way too little for 120 people. Could you look at the original P values before permutations? That might help with debugging. Do you see that they are actually showing many more effects?

Thanks, Marc
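One way to make this comparison concrete is to count how many genes pass a few thresholds on the raw P values and on the permutation-based ones. The sketch below is only an illustration: it assumes the raw P values sit in a column named `p_value` (an assumption about the output layout), and `count_below` is a hypothetical helper.

```python
def count_below(rows, column, thresholds):
    """Count rows whose value in `column` is below each threshold."""
    values = [float(row[column]) for row in rows]
    return {t: sum(v < t for v in values) for t in thresholds}


# Toy rows with made-up values; real rows would come from the results table.
toy_diag_rows = [
    {"p_value": "1e-9", "empirical_feature_p_value": "0.0005"},
    {"p_value": "5e-6", "empirical_feature_p_value": "0.004"},
    {"p_value": "2e-4", "empirical_feature_p_value": "0.04"},
    {"p_value": "0.3", "empirical_feature_p_value": "0.9"},
]

raw_counts = count_below(toy_diag_rows, "p_value", [1e-7, 1e-5, 1e-3])
emp_counts = count_below(
    toy_diag_rows, "empirical_feature_p_value", [0.001, 0.01, 0.05]
)
print("raw:", raw_counts)
print("empirical:", emp_counts)
```

If many genes have small raw P values but few survive in the empirical column, that points at the permutation or correction step rather than at a lack of signal in the data.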

perl-xxp commented 7 minutes ago

Hi Marc,

Here is a screenshot of 'top_qtl_results_all.txt'. [image: 1731463895355]

I found that only when the raw P values reach the <1e-7 level can the final FDR value reach <0.05. What do you mean by "actually showing many more effects"?

Thanks, Xiaopeng