clamsproject / app-dbpedia-spotlight-wrapper

CLAMS wrapper for DBpedia Spotlight
Apache License 2.0

random crash with requests to wikidata.org #8

Closed keighrim closed 1 year ago

keighrim commented 1 year ago

Bug Description

While running the new version of the app (from the main branch, not yet tagged), I found that some outputs contained logged errors instead of annotations.

$ cat cpb-aacip-507-154dn40c26.dbps.mmif | jq ' . | .views[0].metadata.error.stackTrace | fromjson'
jq: error (at <stdin>:28): Invalid numeric literal at line 1, column 7 (while parsing '  File "/usr/local/lib/python3.8/site-packages/clams/restify/__init__.py", line 146, in post
    return self.json_to_response(self.cla.annotate(raw_data, **self.annotate_param_caster.cast(raw_params)))

  File "/usr/local/lib/python3.8/site-packages/clams/app/__init__.py", line 116, in annotate
    annotated = self._annotate(mmif, **runtime_params)

  File "/app/app.py", line 124, in _annotate
    entities = _get_ne_links(res_json)

  File "/app/app.py", line 108, in _get_ne_links
    grounding.extend(list(_get_qid(uri)))

  File "/app/app.py", line 53, in _get_qid
    res = sparql.query().convert()

  File "/usr/local/lib/python3.8/site-packages/SPARQLWrapper/Wrapper.py", line 960, in query
    return QueryResult(self._query())

  File "/usr/local/lib/python3.8/site-packages/SPARQLWrapper/Wrapper.py", line 926, in _query
    response = urlopener(request)

  File "/usr/local/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)

  File "/usr/local/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)

  File "/usr/local/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(

  File "/usr/local/lib/python3.8/urllib/request.py", line 563, in error
    result = self._call_chain(*args)

  File "/usr/local/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)

  File "/usr/local/lib/python3.8/urllib/request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)

  File "/usr/local/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)

  File "/usr/local/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +

  File "/usr/local/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)

  File "/usr/local/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,

  File "/usr/local/lib/python3.8/urllib/request.py", line 1358, in do_open
    r = h.getresponse()

  File "/usr/local/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()

  File "/usr/local/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()

  File "/usr/local/lib/python3.8/http/client.py", line 285, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
')

The error seems to come from the SPARQL query to wikidata.org: the server closed the connection before responding, which surfaces as `RemoteDisconnected` at the bottom of the trace.
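One possible mitigation is to retry the query a few times with exponential backoff before giving up. The following is only a minimal sketch, not the app's actual fix: `call_with_retry`, `TRANSIENT_ERRORS`, and the `retries`, `base_delay`, and `sleep` parameters are hypothetical names introduced here for illustration. Note that `urllib.error.URLError` also covers `HTTPError`, so this sketch would retry HTTP error statuses as well.

```python
import http.client
import time
import urllib.error

# Errors treated as transient (an assumption for this sketch).
# RemoteDisconnected is the exception at the bottom of the trace above.
TRANSIENT_ERRORS = (http.client.RemoteDisconnected, urllib.error.URLError)


def call_with_retry(func, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a transient error, wait and try again.

    Hypothetical helper, not part of the app. The wait doubles after
    each failed attempt (base_delay, 2 * base_delay, ...). The last
    error is re-raised once `retries` attempts have failed.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return func()
        except TRANSIENT_ERRORS:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)
```

In `_get_qid`, the failing call from the trace could then be wrapped as `res = call_with_retry(lambda: sparql.query().convert())`. If every attempt fails the exception still propagates, so the app would log the error as it does now.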

Reproduction steps

This error occurs at random. It probably depends on server load on the Wikidata side or on some sort of security measure, so it is not consistently reproducible.

Expected behavior

No response

Screenshots

No response

Additional context

No response