andresgsaravia / research-engine

A platform on Google App Engine to facilitate researchers' lives.
https://research-engine.appspot.com/

Bug parsing an article's XML #156

Open andresgsaravia opened 11 years ago

andresgsaravia commented 11 years ago

I just tried to add an article (doi:10.1086/309571) to the library and got the following error:

list index out of range
Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~research-engine/0.370752145895514202/src/bibliography.py", line 191, in post
    metadata = parse_xml(item_dom, kind)
  File "/base/data/home/apps/s~research-engine/0.370752145895514202/src/bibliography.py", line 54, in parse_xml
    res["date"] = dom.getElementsByTagName("journal_issue")[0].getElementsByTagName("year")[0].childNodes[0].nodeValue
IndexError: list index out of range

It seems parsing Crossref's XML is still going to be a problem...
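
The traceback points at the chained `[0]` lookups on line 54 of `bibliography.py`: if the record has no `journal_issue` element, or no `year` inside it, `getElementsByTagName` returns an empty list and the index fails. Below is a minimal sketch of a more defensive lookup, assuming the DOM comes from `xml.dom.minidom`. The helper name `first_text` and the sample XML are hypothetical, not the project's code or real Crossref output.

```python
from xml.dom import minidom


def first_text(node, *tag_path):
    """Follow tag_path from node, taking the first match at each step.

    Returns the stripped text of the final element, or None if any
    element along the path is missing or has no non-blank text child.
    """
    for tag in tag_path:
        matches = node.getElementsByTagName(tag)
        if not matches:
            return None
        node = matches[0]
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE and child.nodeValue.strip():
            return child.nodeValue.strip()
    return None


# Hypothetical record with no <journal_issue> element (not real Crossref data).
SAMPLE = """<journal>
  <journal_article>
    <titles><title>Example title</title></titles>
    <publication_date><year>1999</year></publication_date>
  </journal_article>
</journal>"""

dom = minidom.parseString(SAMPLE)

# Try the original location first, then fall back to any <year> in the record.
year = first_text(dom, "journal_issue", "year") or first_text(dom, "year")
```

Whether falling back to any `year` element is correct depends on how Crossref structures each kind of record, so that part is a guess. The point is only that a missing element yields `None` instead of an `IndexError`.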

andresgsaravia commented 10 years ago

Another bug, this time with doi:10.1007/s10948-013-2399-6:

Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/runtime/wsgi.py", line 266, in Handle
    result = handler(dict(self._environ), self._StartResponse)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1519, in __call__
    response = self._internal_error(e)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1511, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1505, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1253, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 1077, in __call__
    return handler.dispatch()
  File "/base/data/home/apps/s~research-engine/1-0-0.372448783206571186/src/generic.py", line 266, in dispatch
    webapp2.RequestHandler.dispatch(self)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 547, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.3/webapp2.py", line 545, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~research-engine/1-0-0.372448783206571186/src/bibliography.py", line 199, in post
    metadata = parse_xml(item_dom, kind)
  File "/base/data/home/apps/s~research-engine/1-0-0.372448783206571186/src/bibliography.py", line 54, in parse_xml
    res["date"] = dom.getElementsByTagName("journal_issue")[0].getElementsByTagName("year")[0].childNodes[0].nodeValue
IndexError: list index out of range

Also, doi:10.1088/0953-8984/26/4/045501 produces an entry with no title.
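
The cause of the missing title isn't shown here, but one possible guard would be to check the parsed metadata before saving it. This is a sketch only, assuming `parse_xml` returns a plain dict. The `"date"` key appears in the traceback; the `"title"` key and the `missing_fields` helper are hypothetical.

```python
REQUIRED_FIELDS = ("title", "date")  # hypothetical; adjust to the real schema


def missing_fields(metadata, required=REQUIRED_FIELDS):
    """Return the required keys that are absent or blank in metadata."""
    missing = []
    for key in required:
        value = metadata.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing
```

A handler could then call `missing_fields(metadata)` and show the user which fields could not be read, rather than storing an incomplete entry.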

andresgsaravia commented 10 years ago

Another one with doi:10.1007/978-3-540-71023-3_8. I really should look into this...

andresgsaravia commented 8 years ago

And yet another one with a similar error: doi:10.1086/118137