UtrechtUniversity / yoda

A system for reliable, long-term storing and archiving large amounts of research data during all stages of a study.
https://utrechtuniversity.github.io/yoda/
GNU General Public License v3.0

[BUG] Internal Server Error on publication node when using verb=ListRecords #236

Closed fj-morales closed 1 year ago

fj-morales commented 1 year ago

Current Behavior

Opening https://publication.xxxx.xxxx/oai/oai/?verb=ListRecords&metadataPrefix=datacite returns an Internal Server Error. The server in question already has published data packages.

The /var/log/httpd/error_log file shows:

[Tue Mar 21 10:13:14.332636 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] mod_wsgi (pid=32382): Exception occurred processing WSGI script '/var/www/moai/moai.wsgi'.
[Tue Mar 21 10:13:14.333274 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] Traceback (most recent call last):
[Tue Mar 21 10:13:14.333329 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/paste/urlmap.py", line 216, in __call__
[Tue Mar 21 10:13:14.333337 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] return app(environ, start_response)
[Tue Mar 21 10:13:14.333342 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/moai/wsgi.py", line 74, in __call__
[Tue Mar 21 10:13:14.333343 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] response = self.server.handle_request(WSGIRequest(request))
[Tue Mar 21 10:13:14.333347 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/moai/server.py", line 118, in handle_request
[Tue Mar 21 10:13:14.333350 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] return req.write(oai_server.handleRequest(req.query_dict()), 'text/xml')
[Tue Mar 21 10:13:14.333371 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 314, in handleRequest
[Tue Mar 21 10:13:14.333396 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] return self.handleException(request_kw, sys.exc_info())
[Tue Mar 21 10:13:14.333413 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 326, in handleException
[Tue Mar 21 10:13:14.333415 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] self._tree_server.handleException(value).getroot(),
[Tue Mar 21 10:13:14.333426 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 311, in handleRequest
[Tue Mar 21 10:13:14.333439 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] return self.handleVerb(verb, request_kw)
[Tue Mar 21 10:13:14.333453 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 318, in handleVerb
[Tue Mar 21 10:13:14.333468 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] return etree.tostring(method(**kw).getroot(),
[Tue Mar 21 10:13:14.333489 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 135, in listRecords
[Tue Mar 21 10:13:14.333499 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] kw)
[Tue Mar 21 10:13:14.333518 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 217, in _outputResuming
[Tue Mar 21 10:13:14.333532 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] output_func(element, result, token_kw)
[Tue Mar 21 10:13:14.333552 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 129, in outputFunc
[Tue Mar 21 10:13:14.333554 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] self._outputMetadata(e_record, metadataPrefix, metadata)
[Tue Mar 21 10:13:14.333568 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/server.py", line 240, in _outputMetadata
[Tue Mar 21 10:13:14.333582 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] metadata_prefix, e_metadata, metadata)
[Tue Mar 21 10:13:14.333597 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/venv/lib/python3.6/site-packages/oaipmh/metadata.py", line 52, in writeMetadata
[Tue Mar 21 10:13:14.333600 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] self._writers[metadata_prefix](element, metadata)
[Tue Mar 21 10:13:14.333610 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] File "/var/www/moai/yoda-moai/moai/metadata/datacite.py", line 49, in __call__
[Tue Mar 21 10:13:14.333619 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] # Language
[Tue Mar 21 10:13:14.333643 2023] [wsgi:error] [pid 32382] [remote 145.90.233.58:33666] UnboundLocalError: local variable 'data' referenced before assignment

In the file /var/www/moai/yoda-moai/moai/metadata/datacite.py the variable data is indeed referenced before assignment: if the lookup raises, the except clause swallows the error and data is never set.

    def __call__(self, element, metadata):
        try:
            data = metadata.record['metadata']['metadata']
        except BaseException:
            pass
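
The failure mode can be reproduced in isolation. The sketch below is not the MOAI code: `FakeMetadata`, `broken_writer`, and `safer_writer` are hypothetical names, and it assumes that a record missing the nested 'metadata' key is what makes the lookup raise. The second function shows one possible way to avoid the unbound variable, not necessarily the fix that was eventually shipped.

```python
class FakeMetadata:
    """Hypothetical stand-in for the metadata object passed to the writer."""

    def __init__(self, record):
        self.record = record


def broken_writer(metadata):
    # Same shape as the snippet above: the lookup error is swallowed,
    # so 'data' is never bound when the nested key is missing.
    try:
        data = metadata.record['metadata']['metadata']
    except BaseException:
        pass
    return data  # raises UnboundLocalError for incomplete records


def safer_writer(metadata):
    # One possible alternative (an assumption, not the actual fix):
    # catch only the expected lookup errors and signal "nothing to write".
    try:
        return metadata.record['metadata']['metadata']
    except (KeyError, TypeError):
        return None


incomplete = FakeMetadata({'metadata': {}})
complete = FakeMetadata({'metadata': {'metadata': {'Title': 'Example'}}})

try:
    broken_writer(incomplete)
    outcome = 'no error'
except UnboundLocalError:
    outcome = 'UnboundLocalError'
```

With the incomplete record, `broken_writer` fails in the same way as the traceback, while `safer_writer` returns None so the caller can decide to skip or log the record.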

Expected Behavior

The records are listed and no Internal Server Error occurs, as is already the case for https://publication.xxxx.xxxx/oai/oai/?verb=ListIdentifiers&metadataPrefix=datacite

Steps To Reproduce

1. Publish datacite package
2. Open link https://publication.xxxx.xxxx/oai/oai/?verb=ListRecords&metadataPrefix=datacite

Environment

- Yoda: 1.8.4
- Ansible: 2.9.7
- Operating System: CentOS 7
- Browser: Google Chrome

Anything else?

No response

stsnel commented 1 year ago

The proximate cause of the problem was a data package which had only an XML metadata file. This format is no longer supported in Yoda 1.8. This causes MOAI to fail on reading the internal metadata for this data package. Usually such metadata would be converted when updating the publication endpoints during the upgrade to Yoda 1.6. Since this is apparently a test data package, perhaps the original data package was removed from the system after publication, so its metadata could not be upgraded to a supported format.

You can work around the issue by regenerating the MOAI database, which discards unsupported metadata:

cd /var/www/moai
mv moai.db moai.db.20230328
sudo -u moai /var/www/moai/yoda-moai/venv/bin/update_moai -q --config /var/www/moai/settings.ini yoda_moai
systemctl restart httpd 

From a development point of view, we need to make MOAI handle these errors more gracefully and make it easier to detect which metadata records are causing problems.
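
Until such detection exists in MOAI itself, one rough way to narrow down the offending record from the outside is to request each identifier separately. The sketch below uses only the standard OAI-PMH verbs ListIdentifiers and GetRecord. It assumes that a GetRecord request for a broken record fails in the same way as ListRecords does, which this thread does not confirm. `http_get` and `find_failing_records` are hypothetical helper names.

```python
import urllib.error
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = '{http://www.openarchives.org/OAI/2.0/}'


def http_get(url):
    """Return (status, body) for a URL; HTTP errors become a status code."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b''


def find_failing_records(base_url, prefix='datacite', fetch=http_get):
    """Return identifiers whose GetRecord request does not return HTTP 200."""
    query = urllib.parse.urlencode(
        {'verb': 'ListIdentifiers', 'metadataPrefix': prefix})
    status, body = fetch(base_url + '?' + query)
    if status != 200:
        raise RuntimeError('ListIdentifiers failed with HTTP %d' % status)
    # Only the first page is read; resumptionToken paging is left out.
    root = ET.fromstring(body)
    identifiers = [el.text for el in root.iter(OAI_NS + 'identifier')]
    failing = []
    for identifier in identifiers:
        query = urllib.parse.urlencode({
            'verb': 'GetRecord',
            'metadataPrefix': prefix,
            'identifier': identifier,
        })
        status, _ = fetch(base_url + '?' + query)
        if status != 200:
            failing.append(identifier)
    return failing
```

Called with the endpoint's base URL (the part before ?verb=...), the function returns the identifiers that could not be rendered. The `fetch` parameter exists so the logic can be tried against canned responses without touching a live server.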

stsnel commented 1 year ago

Internal ticket number for improving MOAI error handling in this situation is YDA-5145.

stsnel commented 1 year ago

A fix has been implemented for Yoda 1.9.0 and is also a candidate for backporting to 1.8.7.