In the process of adding more annotations from dbNSFP in VEP, I noticed that some of the field names contain hyphens, which causes problems. These are some of the affected fields in VEP:
M-CAP_score,M-CAP_pred,Eigen-phred,Eigen-PC-phred
These are in fact brought into the database by gemini. When I do gemini db_info I see them:
variants vep_provean_score TEXT
variants vep_provean_pred TEXT
variants vep_m-cap_score TEXT
variants vep_m-cap_pred TEXT
variants vep_revel_score TEXT
variants vep_revel_rankscore TEXT
variants vep_eigen-phred TEXT
variants vep_eigen-pc-phred TEXT
...
However, when I try to run a query, it doesn't interpret the column name correctly:
gemini query --header -q 'select gene, chrom, start, ref, alt, vep_eigen-phred from variants where impact_severity in ('HIGH', 'MED') and aaf_gnomad_all < 0.001' CCGO_801065.22.db | head
SQL error: (sqlite3.OperationalError) no such column: vep_eigen [SQL: u'select gene, chrom, start, ref, alt, vep_eigen-phred from variants where impact_severity in (HIGH, MED) and aaf_gnomad_all < 0.001']
Traceback (most recent call last):
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/bin/gemini", line 7, in <module>
gemini_main.main()
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/gemini_main.py", line 1244, in main
args.func(parser, args)
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/gemini_main.py", line 439, in query_fn
gemini_query.query(parser, args)
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/gemini_query.py", line 169, in query
run_query(args)
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/gemini_query.py", line 135, in run_query
gene_needed, args.show_families, subjects=subjects)
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/GeminiQuery.py", line 653, in run
self.result_proxy = res = iter(self._apply_query())
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/GeminiQuery.py", line 924, in _apply_query
res = self._execute_query()
File "/sysapps/cluster/software/Anaconda/2.3.0Linux-x86_64/envs/geminienv2/lib/python2.7/site-packages/gemini/GeminiQuery.py", line 883, in _execute_query
raise ValueError("The query issued (%s) has a syntax error." % self.query)
ValueError: The query issued (select gene, chrom, start, ref, alt, vep_eigen-phred from variants where impact_severity in (HIGH, MED) and aaf_gnomad_all < 0.001) has a syntax error.
Instead of finding vep_eigen-phred, it says it can't find vep_eigen. Is there any way for me to construct the query to retrieve the values in column vep_eigen-phred? If not, it would be a good idea to change hyphens to underscores when parsing the extra VEP fields so it doesn't cause this issue.
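For what it's worth, here is a minimal sketch of what I think is going on, using Python's sqlite3 module directly rather than gemini. The toy table below is only a stand-in for the real variants table, and sanitize_column_name is a hypothetical helper I made up to illustrate the hyphen-to-underscore idea; it is not part of gemini. Unquoted, SQLite appears to read vep_eigen-phred as a subtraction (vep_eigen minus phred), which would explain the error message, while double-quoting the name makes it a single identifier.

```python
import re
import sqlite3

# Toy in-memory table standing in for gemini's variants table (not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE variants (gene TEXT, "vep_eigen-phred" TEXT)')
conn.execute("INSERT INTO variants VALUES ('BRCA1', '12.3')")

# Unquoted, the hyphen is treated as a minus sign, so SQLite looks for
# a column called vep_eigen and a column called phred.
try:
    conn.execute("SELECT gene, vep_eigen-phred FROM variants")
    unquoted_error = None
except sqlite3.OperationalError as exc:
    unquoted_error = str(exc)
print(unquoted_error)

# Double quotes mark the whole name as one identifier.
quoted_rows = conn.execute(
    'SELECT gene, "vep_eigen-phred" FROM variants'
).fetchall()
print(quoted_rows)


def sanitize_column_name(name):
    """Hypothetical helper: replace any character that is not a letter,
    digit, or underscore so the name is safe as an unquoted SQL identifier."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


print(sanitize_column_name("vep_eigen-pc-phred"))
```

I have not checked whether gemini query passes a double-quoted identifier through to SQLite unchanged, so quoting may or may not work as a workaround there. Renaming at load time, along the lines of the helper above, would avoid the question entirely.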
Thanks!
Andrew