griffithlab / civicpy

A python interface for the CIViC db application
MIT License

accessing civicrecords from local cache #106

Closed vipints closed 3 years ago

vipints commented 4 years ago

Dear CIViCPy team,

Maybe I am doing something wrong, but I don't know why I get this message when I try to access a CIViC record as shown in the example code snippet:

nexus-253:~ vipin$ export CIVICPY_CACHE_FILE=/Users/vipin/Downloads/nightly-civicpy_cache.pkl 
nexus-253:~ vipin$ python 
Python 3.7.6 (default, Dec 22 2019, 01:09:06) 
[Clang 11.0.0 (clang-1100.0.33.12)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from civicpy import civic, version
>>> version() 
'1.1.2'
>>> variant = civic.get_variant_by_id(12)
WARNING:root:Local cache at /Users/vipin/Downloads/nightly-civicpy_cache.pkl is stale, updating from remote.
WARNING:root:Downloading remote cache from https://civicdb.org/downloads/nightly/nightly-civicpy_cache.pkl.
WARNING:root:Local cache at /Users/vipin/Downloads/nightly-civicpy_cache.pkl is stale, updating from remote.
....

This just keeps repeating in the Python console, and I am not able to access the records. Do you have any idea what is wrong here? The machine has an outside connection to access the service, and I would like to use the cache file I have downloaded to get the records.

Regards, Vipin

ahwagner commented 4 years ago

Hi @vipints. Yes, we have paused live updates to the civicpy cache on production, which leads to this error. We will be patching CIViCpy to resolve this situation with a warning instead of a re-attempt.

In the interim, you can load as follows:

>>> from civicpy import civic
>>> civic.load_cache(on_stale='ignore')
>>> variant = civic.get_variant_by_id(12)

vipints commented 4 years ago

Great, thanks @ahwagner! This works for me.

vipints commented 4 years ago

Hi again, I would definitely like to use this module for our routine report generation queries to get details from CIViCdb. In my case, I will have an HGNC gene symbol and an annotation, for example:

ADORA1  ADORA1:c.341+11376C>T|
ERBB4   ERBB4:c.83-126671T>A|
KMT2A   KMT2A:c.5961+185C>T|;KMT2A:c.5952+185C>T|

I believe I cannot search the cache by HGNC gene symbol and have to go via the CIViC gene identifier. How can I generate a mapping between HGNC gene symbols and CIViC gene identifiers? Or do you see a different way to fetch the record from the CIViC cache using the civicpy module? Please share your suggestions with me. Thanks in advance!
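
A minimal sketch of the mapping asked about above, assuming each gene record exposes `name` and `id` attributes. The helper names `parse_annotation_line` and `build_symbol_index` are hypothetical and not part of civicpy. The stand-in record below reuses the ERBB4 gene ID 1734 that appears in the CIViC URL later in this thread; with civicpy installed, the records would presumably come from something like `civic.get_all_genes()`, which should be checked against the documentation for the installed version.

```python
from types import SimpleNamespace


def parse_annotation_line(line):
    """Split 'SYMBOL  SYMBOL:c.x|;SYMBOL:c.y|' into (symbol, [c. expressions])."""
    symbol, annotations = line.split(None, 1)
    expressions = []
    for entry in annotations.split(";"):
        entry = entry.strip().rstrip("|")
        if not entry:
            continue
        # Drop the repeated gene symbol in front of the colon.
        _, _, hgvs_c = entry.partition(":")
        expressions.append(hgvs_c)
    return symbol, expressions


def build_symbol_index(genes):
    """Map gene name -> gene id for any iterable of objects with .name and .id."""
    return {gene.name: gene.id for gene in genes}


# Stand-in gene record (assumption: real records would come from the civicpy cache).
genes = [SimpleNamespace(name="ERBB4", id=1734)]
symbol_to_id = build_symbol_index(genes)

for line in [
    "ERBB4   ERBB4:c.83-126671T>A|",
    "KMT2A   KMT2A:c.5961+185C>T|;KMT2A:c.5952+185C>T|",
]:
    symbol, expressions = parse_annotation_line(line)
    # .get returns None for symbols missing from the index.
    print(symbol, symbol_to_id.get(symbol), expressions)
```

Building the index once and looking symbols up in a dictionary avoids scanning the gene list for every row of the report.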

susannasiebert commented 4 years ago

You should be able to search by HGVS string. See the documentation here.

vipints commented 4 years ago

Thanks @susannasiebert for the hint! I ran a quick test to fetch records with the following queries and didn't get any records.

In [26]: civic.search_variants_by_hgvs("c.83-126671T>A")                                          
Out[26]: []

In [27]: civic.search_variants_by_hgvs("c.5961+185C>T")                                           
Out[27]: []

Is something wrong with my query, or does CIViC not have an entry for those variants? Thanks!

vipints commented 4 years ago

c.83-126671T>A matched to CIViC variant MUTATION (of gene ERBB4)

susannasiebert commented 4 years ago

Unfortunately, that method only works on variants that have manually curated HGVS expressions in CIViC, and https://civicdb.org/events/genes/1734/summary/variants/310/summary does not have one. On top of that, CIViC is not currently laid out to support searching by hierarchical variants, e.g. c.83-126671T>A is an ERBB4 MUTATION. You could try searching by coordinates without ref and alt.
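
To illustrate why a coordinate search without ref and alt can find a gene-level entry such as ERBB4 MUTATION, here is a small sketch of region overlap matching. It does not call civicpy: `Region`, `overlaps`, and `search_by_region` are hypothetical names, and the coordinates are made-up placeholders rather than real ERBB4 positions. civicpy documents its own coordinate search, whose exact name and signature should be checked against the installed version.

```python
from collections import namedtuple

# Hypothetical record type: chromosome plus an inclusive start/stop range.
Region = namedtuple("Region", ["chrom", "start", "stop"])


def overlaps(query, record):
    """True when both regions are on the same chromosome and share a position."""
    return (
        query.chrom == record.chrom
        and query.start <= record.stop
        and record.start <= query.stop
    )


def search_by_region(query, records):
    """Return the names of all records whose region overlaps the query."""
    return [name for name, region in records if overlaps(query, region)]


# Placeholder coordinates, NOT real genomic positions.
records = [
    ("GENE_LEVEL_MUTATION", Region("2", 1000, 9000)),  # spans a whole gene
    ("POINT_VARIANT", Region("2", 20000, 20000)),      # single base elsewhere
]

# A single-base query inside the gene-level range, with no ref/alt supplied.
query = Region("2", 5000, 5000)
print(search_by_region(query, records))
```

Because only positions are compared, a broad entry matches any variant that falls inside its range, which is the behavior an exact HGVS string match cannot provide.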

vipints commented 4 years ago

Understood. Otherwise, I think I have to provide the complete HGVS expression, such as NC_000007.13:g.140453136A>T. Thanks!
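
For reference, a complete genomic HGVS expression like the one above carries a reference sequence accession, a position, and the ref and alt bases. The sketch below pulls those parts out with a regular expression. It handles only simple single-base genomic substitutions and is not a full HGVS parser; `parse_genomic_substitution` is a hypothetical helper, not a civicpy function.

```python
import re

# Matches only simple genomic substitutions such as NC_000007.13:g.140453136A>T.
GENOMIC_SUBSTITUTION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):g\."
    r"(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_genomic_substitution(expression):
    """Return the parts of a genomic substitution, or None if it does not match."""
    match = GENOMIC_SUBSTITUTION.match(expression)
    if match is None:
        return None
    parts = match.groupdict()
    parts["position"] = int(parts["position"])
    return parts


print(parse_genomic_substitution("NC_000007.13:g.140453136A>T"))
# A bare coding expression has no accession, so it is rejected.
print(parse_genomic_substitution("c.83-126671T>A"))
```

The second call shows why the earlier queries in this thread are incomplete on their own: a bare `c.` expression names no reference sequence.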