vkp3 closed this issue 4 years ago
Hi,
Yes, this is definitely possible. The phenotypes and genotypes can be any continuous value, and all you'll need to do is specify a 'variant' mapping (in variant_df) that links your 'genotype' values to a phenotype via the cis-window.
Here's an example of how you could do this:
import pandas as pd
import numpy as np
from tensorqtl import cis
np.random.seed(12345)
n = 100 # samples
m = 20 # genotypes
genotype_df = pd.DataFrame(np.random.rand(m, n))
phenotype_df = pd.Series(np.random.rand(n), name='phenotype_1').to_frame().T
phenotype_pos_df = pd.DataFrame({'chr':['chr1'], 'pos':[1]}, index=['phenotype_1'])
variant_df = pd.DataFrame({'chrom':['chr1']*m, 'pos':np.arange(m)}, index=genotype_df.index)
# permutations
cis_df = cis.map_cis(genotype_df, variant_df, phenotype_df, phenotype_pos_df, window=1000000)
# nominal associations
cis.map_nominal(genotype_df, variant_df, phenotype_df, phenotype_pos_df, 'test',
                output_dir='.', window=1000000)
You'll need to use the latest commit if not including any covariates as in this example (I'll make a new release soon).
Edit: fix for phenotype_pos_df change ('pos' instead of 'tss').
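For the use case in the original question (one trait tested against the expression of many genes), the same layout can be flipped: the trait becomes a single row of genotype_df and each gene becomes a row of phenotype_df. Here is a minimal sketch of the inputs, assuming the column names used in the example above ('chrom'/'pos' for variant_df, 'chr'/'pos' for phenotype_pos_df). Placing every gene and the pseudo-variant at the same dummy position, so that the trait falls inside every cis-window, is an assumption of this sketch, and the sample, gene, and trait names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(12345)
n_samples = 100
n_genes = 50
samples = [f"sample_{i}" for i in range(n_samples)]   # hypothetical sample IDs
genes = [f"gene_{j}" for j in range(n_genes)]         # hypothetical gene IDs

# One continuous trait, stored as a single pseudo-variant (1 x samples).
genotype_df = pd.DataFrame(rng.random((1, n_samples)),
                           index=["trait_1"], columns=samples)

# Expression of each gene (genes x samples), sharing the same sample columns.
phenotype_df = pd.DataFrame(rng.random((n_genes, n_samples)),
                            index=genes, columns=samples)

# Assumption: dummy coordinates, so the pseudo-variant lies in every gene's cis-window.
variant_df = pd.DataFrame({"chrom": ["chr1"], "pos": [1]}, index=genotype_df.index)
phenotype_pos_df = pd.DataFrame({"chr": ["chr1"] * n_genes, "pos": [1] * n_genes},
                                index=genes)

# The mapping call would then mirror the example above (not executed here):
# cis_df = cis.map_cis(genotype_df, variant_df, phenotype_df, phenotype_pos_df, window=1000000)
```

With real data, the dummy coordinates would be replaced by whatever positions make sense for the analysis. How the permutation scheme behaves with a single variant per window is not tested in this sketch.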
Hi, this code results in the error KeyError: 'start'. I'm wondering if there is another way to "encode the phenotype as a dosage for a single genetic variant, across individuals" in order to map pheno to pheno in the new version? Thanks!
Please try again with the updated code above, it should work now.
Thanks! I fixed that; encoding the phenotype following the updated code solves the problem!
Hello,
I posted this issue in @francois-a's package fastqtl, but since tensorqtl is similar, I'm writing the same question here:
I'm attempting to replicate the gtex-pipeline using scripts from the gtex-pipeline repository on GitHub.
However, my use case is a bit different from eQTL mapping. I would like to run a phenotype-QTL mapping where, instead of many single-variant association tests, I test a general phenotype for association with the expression of many genes across tissues.
Most importantly, I would like to achieve fast performance on permutation testing for computing p-values of such a phenotype and its association with gene expression (across genes).
My intuition is to encode the phenotype as a dosage for a single genetic variant, across individuals, but I am unsure if this is supported by FastQTL and/or TensorQTL.
So, I'm wondering if this could be possible? If so, could you help me encode this strategy for FastQTL or TensorQTL? I would like to utilize the programs' fast performance in terms of permutation testing.
Thank you
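For reference, the quantity asked about above, an empirical permutation p-value for one trait against the expression of many genes, can be sketched in plain NumPy. This is only an illustration of the idea and not the scheme used by FastQTL or tensorQTL; the function name permutation_pvalues and its parameters are hypothetical.

```python
import numpy as np

def permutation_pvalues(trait, expression, n_perm=1000, seed=0):
    """Empirical p-values for |Pearson r| between one trait and each gene.

    trait: array of shape (n_samples,)
    expression: array of shape (n_genes, n_samples)
    """
    rng = np.random.default_rng(seed)
    n = trait.shape[0]
    # Standardize so that a dot product divided by n equals Pearson r.
    t = (trait - trait.mean()) / trait.std()
    e = expression - expression.mean(axis=1, keepdims=True)
    e = e / expression.std(axis=1, keepdims=True)
    r_obs = np.abs(e @ t) / n
    exceed = np.zeros(expression.shape[0], dtype=int)
    for _ in range(n_perm):
        # Shuffle the trait across samples and recompute all correlations at once.
        r_perm = np.abs(e @ rng.permutation(t)) / n
        exceed += r_perm >= r_obs
    # Add-one correction so that a p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)

# Usage on random data; gene 0 is constructed to track the trait closely.
rng = np.random.default_rng(1)
trait = rng.random(100)
expression = rng.random((5, 100))
expression[0] = trait + 0.01 * rng.random(100)
pvals = permutation_pvalues(trait, expression, n_perm=200)
```

Each permutation costs one matrix-vector product over all genes, which is the kind of operation that benefits from running on a GPU.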