denshoproject / ddr-local

Web UI used for interacting with DDR collections and entities on a local machine.

entity_edit task crashes if file is missing from IA #299

Closed gjost closed 11 months ago

gjost commented 2 years ago

A user tried to save ddr-densho-435-1-1 and got the following error. According to the logic of DDR.archivedotorg.is_iaobject and DDR.archivedotorg.get_ia_metadata, there should be metadata for this object at the Internet Archive, but none exists there yet.

Traceback (most recent call last):
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/ddr-local/ddrlocal/webui/tasks/entity.py", line 82, in entity_edit
    git_name, git_mail, collection, form_data
  File "/opt/ddr-local/ddrlocal/webui/models.py", line 773, in save
    docstore.Docstore().post(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 578, in post
    public_fields=public_fields, public=public, b2=b2
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/models/common.py", line 420, in to_esobject
    d.ia_meta = archivedotorg.get_ia_meta(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/archivedotorg.py", line 29, in get_ia_meta
    raise FileNotFoundError(f'No Internet Archive data for {o.identifier.id}.')
FileNotFoundError: No Internet Archive data for ddr-densho-435-1-1.

Task should catch the error and not cause a crash.
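
As a rough illustration of the behaviour asked for here (not the project's actual code), the task could trap the exception and report it instead of letting it propagate. In this sketch `get_ia_meta`, `save_entity` and `entity_edit` are simplified stand-ins with hypothetical signatures, and `ia_store` is an invented dict that plays the role of the Internet Archive.

```python
import logging

logger = logging.getLogger(__name__)


def get_ia_meta(object_id, ia_store):
    """Stand-in for the IA lookup (hypothetical signature)."""
    if object_id not in ia_store:
        raise FileNotFoundError(f'No Internet Archive data for {object_id}.')
    return ia_store[object_id]


def save_entity(object_id, ia_store):
    """Stand-in for the save step: builds the document to be indexed."""
    return {'id': object_id, 'ia_meta': get_ia_meta(object_id, ia_store)}


def entity_edit(object_id, ia_store):
    """Task body: report missing IA data instead of crashing."""
    try:
        document = save_entity(object_id, ia_store)
    except FileNotFoundError as err:
        # Log the problem and hand back a result the caller can display.
        logger.warning('entity_edit: %s', err)
        return {'status': 'error', 'id': object_id, 'message': str(err)}
    return {'status': 'ok', 'id': object_id, 'document': document}
```

Calling `entity_edit('ddr-densho-435-1-1', {})` in this sketch returns an error result rather than raising, which is the outcome the task needs.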

gjost commented 2 years ago

Fixed in ddr-local commit b0c67bd for package ddrlocal-master-5.2.4.

pkikawa commented 2 years ago

still getting

Exception Type: FileNotFoundError at /ui/entity/ddr-testing-40379-19/new-idservice/
Exception Value: No Internet Archive data for ddr-testing-40379-19-2.

from new segment.

full traceback:

Traceback (most recent call last):
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/ddr-local/ddrlocal/webui/decorators.py", line 18, in wrapper
    return f(*args, **kwargs)
  File "/opt/ddr-local/ddrlocal/webui/views/decorators.py", line 18, in inner
    return func(request, *args, **kwargs)
  File "/opt/ddr-local/ddrlocal/storage/decorators.py", line 83, in inner
    return func(request, *args, **kwargs)
  File "/opt/ddr-local/ddrlocal/webui/views/entities.py", line 396, in new_idservice
    entity = _create_entity(request, eidentifier, collection, git_name, git_mail)
  File "/opt/ddr-local/ddrlocal/webui/views/entities.py", line 317, in _create_entity
    exit,status = Entity.create(eidentifier, git_name, git_mail)
  File "/opt/ddr-local/ddrlocal/webui/models.py", line 740, in create
    docstore.Docstore().post(entity)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 578, in post
    public_fields=public_fields, public=public, b2=b2
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/models/common.py", line 420, in to_esobject
    d.ia_meta = archivedotorg.get_ia_meta(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/archivedotorg.py", line 29, in get_ia_meta
    raise FileNotFoundError(f'No Internet Archive data for {o.identifier.id}.')

Exception Type: FileNotFoundError at /ui/entity/ddr-testing-40379-19/new-idservice/
Exception Value: No Internet Archive data for ddr-testing-40379-19-3.

gjost commented 2 years ago

Sigh...

gjost commented 2 years ago

@pkikawa How did you test this? I installed the ddrlocal-master-5.2.4 package on a test VM, checked out ddr-densho-435, and was able to use the ddr-local web UI to edit and save ddr-densho-435-1-1 with no problems. That object does provoke an IA error, as you can see from this ddrindex run on the same VM. Note that ddrindex displays the error but does not stop execution.

(ddrlocal) ddr@densho102:/opt/ddr-local$ ddrindex publish /var/www/media/ddr/ddr-densho-435/files/ddr-densho-435-1/ -r --force
2021-10-14 10:42:20.103886-07:00 | 1/2 POST ddr-densho-435-1 
Traceback (most recent call last):
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 696, in post_multi
    document, parents=parents, b2=b2_synced, force=True
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 578, in post
    public_fields=public_fields, public=public, b2=b2
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/models/common.py", line 420, in to_esobject
    d.ia_meta = archivedotorg.get_ia_meta(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/archivedotorg.py", line 29, in get_ia_meta
    raise FileNotFoundError(f'No Internet Archive data for {o.identifier.id}.')
FileNotFoundError: No Internet Archive data for ddr-densho-435-1.
ERROR: not created
2021-10-14 10:42:22.746674-07:00 | 2/2 POST ddr-densho-435-1-1 
Traceback (most recent call last):
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 696, in post_multi
    document, parents=parents, b2=b2_synced, force=True
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 578, in post
    public_fields=public_fields, public=public, b2=b2
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/models/common.py", line 420, in to_esobject
    d.ia_meta = archivedotorg.get_ia_meta(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/archivedotorg.py", line 29, in get_ia_meta
    raise FileNotFoundError(f'No Internet Archive data for {o.identifier.id}.')
FileNotFoundError: No Internet Archive data for ddr-densho-435-1-1.
ERROR: not created
{'total': 2, 'skipped': 0, 'successful': 0, 'bad': [{'path': '/var/www/media/ddr/ddr-densho-435/files/ddr-densho-435-1', 'identifier': <DDR.identifier.Identifier entity:ddr-densho-435-1>, 'action': 'POST', 'note': ''}, {'path': '/var/www/media/ddr/ddr-densho-435/files/ddr-densho-435-1/files/ddr-densho-435-1-1', 'identifier': <DDR.identifier.Identifier segment:ddr-densho-435-1-1>, 'action': 'POST', 'note': ''}]}
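
The "display the error but keep going" behaviour seen in this run can be sketched roughly as follows. This is a simplified illustration, not the DDR.docstore code: `post_multi` here is a hypothetical reduction that takes a list of IDs and a `post` callable, and `failing_post` is an invented stand-in that always reports missing IA data.

```python
import traceback


def post_multi(documents, post):
    """Post each document; on failure print the traceback and continue."""
    results = {'total': len(documents), 'skipped': 0, 'successful': 0, 'bad': []}
    for n, doc_id in enumerate(documents, start=1):
        print(f'{n}/{len(documents)} POST {doc_id}')
        try:
            post(doc_id)
        except Exception:
            # Show the error, record the failure, move on to the next item.
            traceback.print_exc()
            print('ERROR: not created')
            results['bad'].append(
                {'identifier': doc_id, 'action': 'POST', 'note': ''}
            )
        else:
            results['successful'] += 1
    return results


def failing_post(doc_id):
    """Stand-in for a post that finds no Internet Archive data."""
    raise FileNotFoundError(f'No Internet Archive data for {doc_id}.')
```

With `failing_post`, both items end up in `bad` and the loop still reaches the summary, which matches the shape of the output above.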

gjost commented 2 years ago

For good measure, in case some state had crept into my test VM, I spun up a completely new and pristine Debian 10 VM and installed the ddrlocal-master_5.2.4~deb10_amd64.deb package on it. After getting ddr set up with Gitolite credentials, I cloned ddr-densho-435 with git and edited ddr-densho-435-1-1 successfully. Again, ddrindex ran with the same results as in the last comment.

I don't think this is a problem with the code. It's some problem with the setup on the VM, or with the collection itself. Maybe that machine is sitting near a source of magnetism or cosmic rays?

gjost commented 2 years ago

Confirmed that adding a new segment produces the error. It would have saved me some time if you could have clearly stated that you got the error while adding a new segment. (To be fair, it would have helped if I'd actually read through the stack trace in detail.)

gjost commented 2 years ago

Fixed in ddr-local commit 1836dd2 for package ddrlocal-master-5.2.5.

gjost commented 2 years ago

We need to catch FileNotFoundError in webui/tasks/files.py too. Sara encountered the error with ddr-densho-435-1-1:

Traceback (most recent call last):
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/celery/app/trace.py", line 412, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/celery/app/trace.py", line 704, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/ddr-local/ddrlocal/webui/tasks/files.py", line 483, in set_signature              < < < < < < < < < < < < < < < <
    {}
  File "/opt/ddr-local/ddrlocal/webui/models.py", line 773, in save
    docstore.Docstore().post(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/docstore.py", line 578, in post
    public_fields=public_fields, public=public, b2=b2
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/models/common.py", line 420, in to_esobject
    d.ia_meta = archivedotorg.get_ia_meta(self)
  File "/opt/ddr-local/venv/ddrlocal/lib/python3.7/site-packages/ddr_cmdln-5.2.3-py3.7.egg/DDR/archivedotorg.py", line 29, in get_ia_meta
    raise FileNotFoundError(f'No Internet Archive data for {o.identifier.id}.')
FileNotFoundError: No Internet Archive data for ddr-densho-435-1-1.
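
Since the same exception has now surfaced in several tasks, one possible way to avoid repeating the try/except is a small decorator. This is only a sketch under assumptions: `tolerate_missing_ia_data` is a hypothetical name, and the `set_signature` below is a stand-in that always raises, not the real task in webui/tasks/files.py.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def tolerate_missing_ia_data(task_func):
    """Wrap a task so a FileNotFoundError is reported instead of raised."""
    @functools.wraps(task_func)
    def wrapper(*args, **kwargs):
        try:
            return task_func(*args, **kwargs)
        except FileNotFoundError as err:
            logger.warning('%s: %s', task_func.__name__, err)
            return {'status': 'error', 'message': str(err)}
    return wrapper


@tolerate_missing_ia_data
def set_signature(object_id):
    """Stand-in task that hits the missing-IA-data error."""
    raise FileNotFoundError(f'No Internet Archive data for {object_id}.')
```

One caveat with this approach: it swallows every FileNotFoundError raised inside the task, not only the one for missing Internet Archive data, so a dedicated exception type might be a safer choice.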

gjost commented 2 years ago

Fixed in ddr-local commit 24d53b9 in package ddrlocal-master_5.2.6.