Closed: jfreidin closed this issue 5 years ago
@jfreidin What instance name?
INFO:biocommons.seqrepo:biocommons.seqrepo 0.4.4
INFO:hgvs.dataproviders.seqfetcher:Using SeqRepo(/compbio_res/monitoring/seqrepo/2018-11-26) sequence fetching
INFO:hgvs.dataproviders.uta:connected to postgresql://anonymous:anonymous@uta.biocommons.org/uta/uta_20161216...
In [6]: hv.validate(hp.parse_hgvs_variant('NM_020975.4:c.2307_2309delGCGinsTCA'))
...
HGVSInvalidVariantError: NM_020975.4:c.2307_2309delinsTCA: Variant reference (GCG) does not agree with reference sequence (TCG)
In [7]: hv.validate(hp.parse_hgvs_variant('NM_020975.5:c.2307_2309delGCGinsTCA'))
...
HGVSDataNotAvailableError: No transcript definition for (tx_ac=NM_020975.5)
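To make the first error above concrete, here is a minimal sketch of the kind of comparison behind "Variant reference (GCG) does not agree with reference sequence (TCG)". It is not the hgvs implementation: `reference_agrees` is a hypothetical helper, the sequence is a toy string rather than real NM_020975.4 data, and it uses plain 1-based positions on that string, ignoring the CDS offset that real c. coordinates need.

```python
def reference_agrees(sequence, start, end, stated_ref):
    """Compare the stated reference bases with what the sequence holds.

    start and end are 1-based, inclusive positions on the toy sequence.
    Returns (agrees, observed_bases).
    """
    observed = sequence[start - 1:end]
    return observed == stated_ref, observed


# Toy sequence (an assumption for illustration): positions 3-5 hold "TCG".
toy_sequence = "AATCGAA"
agrees, observed = reference_agrees(toy_sequence, 3, 5, "GCG")
print(agrees, observed)
```

When the stated bases differ from the observed ones, the check fails, which is the situation the validator reports for the .4 transcript.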
@jfreidin I should have updated the defaults long ago. The next release will use a newer UTA (uta_20171026).
I don't understand how the content could differ between the public instance and the Docker instance. The workflow has always been to build the public instances from exactly the same snapshots that were used for the Docker images. I have also never made ad hoc changes to a database after deployment. (As it turns out, I learned recently that this was done on uta_20180821. That instance is broken. See biocommons/hgvs#537.)
@reece Thank you for clarifying the environmental difference.
My local Docker instance is:
UTA_DB_URL=postgresql://anonymous@localhost:15032/uta/uta_20171026
When I switched from the default URL to
UTA_DB_URL=postgresql://anonymous:anonymous@uta.biocommons.org/uta/uta_20171026
they behave the same:
In [2]: hv.validate(hp.parse_hgvs_variant('NM_020975.5:c.2307_2309delGCGinsTCA'))
INFO:biocommons.seqrepo.fastadir.fastadir:Opening for reading: /compbio_res/monitoring/seqrepo/2018-11-26/sequences/2017/1026/2234/1509057245.87.fa.bgz
Out[2]: True
Whew! I like reproducibility and thought that I'd blown it somewhere!
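Since the difference came down to which UTA snapshot each URL points at, a quick way to see that is to pull the schema name out of the URL. This is a minimal sketch using only the standard library; `describe_uta_url` is a hypothetical helper, not part of hgvs, and it assumes the URL path has the form /database/schema as in the URLs quoted in this thread.

```python
from urllib.parse import urlsplit


def describe_uta_url(db_url):
    """Split a UTA_DB_URL-style string into host, port, database, and schema."""
    parts = urlsplit(db_url)
    # Assumed path layout: /<database>/<schema>, e.g. /uta/uta_20171026
    segments = [s for s in parts.path.split("/") if s]
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": segments[0] if segments else None,
        "schema": segments[1] if len(segments) > 1 else None,
    }


local = describe_uta_url("postgresql://anonymous@localhost:15032/uta/uta_20171026")
public = describe_uta_url("postgresql://anonymous:anonymous@uta.biocommons.org/uta/uta_20171026")
# The default URL as it appears in the "connected to" log line earlier in the thread.
old_default = describe_uta_url("postgresql://anonymous:anonymous@uta.biocommons.org/uta/uta_20161216")

for name, info in [("local", local), ("public", public), ("old default", old_default)]:
    print(name, info["host"], info["schema"])
```

Comparing the `schema` values makes it obvious when two connections use different snapshots even though the host is the same.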
My local Docker instance of UTA has ['NM_020975.4', 'NM_020975.5'], but the public UTA instance, on which I'm relying for an instance without Docker, has only ['NM_020975.4']. This causes a difference in behavior for c.2307_2309GCG>TCA. In NM_020975.4, c.2307 is apparently a T, but in both NM_020975.5 and NC_000010.10 it's a G. So on my local instance I get a different result than when using the public instance.
I suspect all that needs to happen is updating the public UTA server with the latest data?
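The gap between the two instances can be summarized with a simple set comparison. The sketch below is only an illustration: `compare_versions` is a hypothetical helper, and the two lists are copied from this post rather than queried from either database.

```python
def compare_versions(local_acs, public_acs):
    """Report which transcript accessions are present in one instance only."""
    local_set, public_set = set(local_acs), set(public_acs)
    return {
        "only_local": sorted(local_set - public_set),
        "only_public": sorted(public_set - local_set),
        "shared": sorted(local_set & public_set),
    }


# Accession lists as reported above for the Docker and public instances.
diff = compare_versions(["NM_020975.4", "NM_020975.5"], ["NM_020975.4"])
print(diff)
```

Anything listed under `only_local` is a transcript version that a variant validated locally may fail to resolve against the public instance.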