Closed zhangpicb closed 1 week ago
These four variants are skipped on purpose in TensorQTL, since they are monomorphic for the 205 Brain_Cortex samples. If you use the --warn_monomorphic
flag you'll see this in the log. This unlikely has an effect on the significance of the gene. The small differences in pval_beta
and qval
that you're seeing between your run and V8 is likely due to a different set of permutations being used. Also make sure you're using --qvalue_lambda 0.85
to match V8 settings.
Thanks for your quick reply!
First,
bcftools filter -e "MAF < 0.01" GTEx_Analysis_2017-06-05_v8_WholeGenomeSeq_838Indiv_Analysis_Freeze.SHAPEIT2_phased.vcf.gz -Oz -o GTEx_Analysis_2017-06-05_v8_WholeGenomeSeq_838Indiv_Analysis_Freeze.SHAPEIT2_phased.MAF001.vcf.gz
tabix -p vcf GTEx_Analysis_2017-06-05_v8_WholeGenomeSeq_838Indiv_Analysis_Freeze.SHAPEIT2_phased.MAF001.vcf.gz
plink2 --make-bed --keep-allele-order --double-id \
--output-chr chrM \
--vcf GTEx_Analysis_2017-06-05_v8_WholeGenomeSeq_838Indiv_Analysis_Freeze.SHAPEIT2_phased.MAF001.vcf.gz \
--out GTEx.v8
python3 -m tensorqtl --mode cis \
GTEx.v8 \
Brain_Cortex.v8.normalized_expression.bed.gz \
Brain_Cortex \
--covariates Brain_Cortex.v8.covariates.txt \
--window 1000000 --seed 3042235018 --permutations 10000 --fdr 0.05 --qvalue_lambda 0.85 \
-o GTEx
seed from https://github.com/broadinstitute/tensorqtl/blob/master/example/GTEx_v8_example.ipynb Brain_Cortex related files are download directed from dbgap or gtex portal
And I use FastQTL to run permutation TUBB3 still a eGene,but not a eGene for GTEx portal download file
ENSG00000258947.6 5927 1.05606 500.296 139.718 0.000121028 chr16_89935882_T_C_b38 14490 150 200 0.487805 1 2.93434e-05 -0.164055 0.0381673 0.0474953 0.0489463 0.0466502 0.000131857
Sencond,
TUBB3 5927 SNPs have 7 monomorphic SNPs,I guess Tensorqtl maybe remove long INDEL,not all monomorphic SNPs
chr16_88975138_T_TCCA_b38
chr16_88975138_TGCAG_T_b38
chr16_89158355_C_A_b38
chr16_89158355_C_CTCACACGCCTGCATATGTCACATGTATGTGTATA_b38
chr16_89165405_G_GAGACCTGGATTGCAGGCCAGGCGGCAGC_b38
chr16_89165405_G_GC_b38
chr16_89230997_C_T_b38
chr16_89230997_CCTGGATGGACAGACCTGTGCACCTTCATGTCACGTGTTTGAAATT_C_b38
chr16_89233774_A_ACTGCGTATGCAACTCAG_b38
chr16_89233774_A_G_b38
chr16_89242688_A_C_b38
chr16_89242688_A_G_b38
chr16_89591801_C_A_b38
chr16_89591801_C_T_b38
Thanks in advanced!
None of the variants you listed are monomorphic (they all have some missingness though). You can get the exact VCF that was used for V8 from AnVIL: https://console.cloud.google.com/storage/browser/_details/fc-secure-ff8156a3-ddf3-42e4-9211-0fd89da62108/GTEx_Analysis_2017-06-05_v8_WGS_VCF_files/GTEx_Analysis_2017-06-05_v8_WholeGenomeSeq_838Indiv_Analysis_Freeze.SHAPEIT2_phased.MAF01.vcf.gz
The example notebook in this repository is to illustrate how V8 can be run, but does not exactly replicate it due to different seeds/software being used.
Thanks for quick reply! I made an incorrect assumption about monomorphic,I think you mean multi allelic variants.I query vcf for Brain_Cortex samples and find 4 monomorphic. ITensorqtl treat variants numbers more carefully than Fastqtl. Thanks a lot!
Tensorqtl skip some variants in cis mode,causing some genes become eGene. TUBB3 gene in GTEx Brain_Cortex
Brain_Cortex.v8.egenes.txt.gz Download from gtex portal TUBB3 test 5927 SNPs,qvalue 0.050
Tenosrqtl TUBB3 test 5923 SNPs,qvalue 0.0477745
check bim file used for GTEx Brain_Cortex Tensorqtl cis mode,5927 SNPs are inclued,4 SNPs are not tested
cis-nomial test 5927 SNPs