SACGF / variantgrid

VariantGrid public repo

Storing PubMed API errors - KeyError: 'PMID' #1002

Closed: davmlaw closed this issue 1 month ago

davmlaw commented 7 months ago

View details in Rollbar: https://app.rollbar.com/a/jimmy.andrews/fix/item/VariantGrid/4887

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/django/core/handlers/base.py", line 179, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/data/variantgrid/annotation/views.py", line 525, in citations_json
    cached_citations = get_citations(citations)
  File "/data/variantgrid/annotation/citations.py", line 218, in get_citations
    citations_by_citation_id[cite.pk] = get_citation_from_cached_citation(cite.cachedcitation)
  File "/data/variantgrid/annotation/citations.py", line 82, in get_citation_from_cached_citation
    citation_id = record["PMID"]
KeyError: 'PMID'

davmlaw commented 7 months ago

A cached citation was stored with an error response in json_string (note that has_error is False):

{'_state': <django.db.models.base.ModelState at 0x7f93e68c9f40>,
 'id': 31654,
 'created': datetime.datetime(2024, 2, 22, 5, 28, 35, 617134, tzinfo=<UTC>),
 'modified': datetime.datetime(2024, 2, 22, 5, 28, 35, 617193, tzinfo=<UTC>),
 'citation_id': 97071,
 'json_string': '{"{": [""], "   \\"": ["rsion\\":\\"1.0\\",", "pe\\": \\"error\\",", "iginalURL\\": \\" _skip_ \\",", "scription\\": \\"Failed to connect to pubone/efetch?db=pubmed&email=david.lawrence%40sa.gov.au&format=text&View=medline&tool=biopython&part=data&Start=0&uids=30474650,35984436&ncbi_sid=23BEFDE9D6CAC635_857ESID&ncbi_phid=D0BD2A910DB406F500003C19B23309F1.1.1.3 : finishConnect(..) failed: Connection refused: /10.74.128.28:4140 at remote address: /10.74.128.28:4140. Remote Info: Not Available\\""], "}": [""]}',
 'has_error': False}

When I deleted this record, it worked. We shouldn't store temporary errors in the DB.
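One way to avoid caching a payload like the one above is to check that the parsed record actually looks like a Medline record before serialising it. This is only a minimal sketch: `is_valid_medline_record` and `serialise_for_cache` are hypothetical names, not functions from the VariantGrid code base, and the only assumption is that a usable record carries a non-empty "PMID" key (the key the traceback fails on).

```python
import json
from typing import Optional


def is_valid_medline_record(record: dict) -> bool:
    """Hypothetical check: a usable record must carry a non-empty PMID."""
    return bool(record.get("PMID"))


def serialise_for_cache(record: dict) -> Optional[str]:
    """Return a JSON string to cache, or None if the record should not be stored."""
    if not is_valid_medline_record(record):
        # Probably an error page parsed as if it were Medline text; don't cache it.
        return None
    return json.dumps(record)
```

A caller would skip the database write (or set an error flag) whenever `serialise_for_cache` returns `None`.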

davmlaw commented 6 months ago

The code has changed a lot since VG3, but the problem remains: if the network drops out, it'll cache the temporary error, e.g. "Temporary failure in name resolution" or "connection refused".

I've made citations decide whether they can refresh themselves; they will do so if they have an error that isn't a "record not found".
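Roughly, the refresh decision could look like the sketch below. This is an illustration rather than the actual VariantGrid implementation: `CachedResult`, `NOT_FOUND_MARKERS` and `should_refresh` are made-up names, and matching on the error text is an assumption about how a permanent "record not found" is told apart from a temporary network failure.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed marker for a permanent failure; anything else is treated as retryable.
NOT_FOUND_MARKERS = ("record not found",)


@dataclass
class CachedResult:
    """Hypothetical stand-in for a cached citation fetch result."""
    error: Optional[str] = None


def should_refresh(cached: CachedResult) -> bool:
    """Refresh only when there is an error that is not a 'record not found'."""
    if cached.error is None:
        return False  # good data, nothing to do
    message = cached.error.lower()
    return not any(marker in message for marker in NOT_FOUND_MARKERS)
```

With this shape, "Temporary failure in name resolution" or "Connection refused" triggers a new fetch on the next request, while a genuine "record not found" stays cached.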

davmlaw commented 4 months ago

Testing VG Test

I found a citation that was linked to variant 1334966 by ClinVar, then deleted it: Citation.objects.filter(pk='PMID:30727941').delete()

I then changed /etc/hosts:

127.0.1.1       eutils.ncbi.nlm.nih.gov

And went to:

http://test.variantgrid.com/variantopedia/view_variant/1334966

Page then renders correctly with:

PMID:30727941
Error when attempting to Entrez.efetch ids ['30727941'] : <urlopen error [Errno 111] Connection refused>

Reloaded the page, and the citation now appears.