denshoproject / ddr-cmdln

Command-line tools for automating the Densho Digital Repository's various processes.

File ingest fails #110

Closed gjost closed 5 years ago

gjost commented 5 years ago

File ingest fails when entity.json is not present in the list of staged files.

Background tasks page

Could not upload ddr-densho-349-46_inside_mezz.tif to ddr-testing-40097-1. Exception('Add file aborted, see log file for details: /var/log/ddr/addfile/ddr-testing-40097/ddr-testing-40097-1.log',)"

Addfile log

[2018-09-24T15:23:02.479497-07:00] ok - ------------------------------------------------------------------------
[2018-09-24T15:23:02.479553-07:00] ok - DDR.models.Entity.add_file: START
[2018-09-24T15:23:02.479584-07:00] ok - entity: ddr-testing-40097-1
[2018-09-24T15:23:02.479651-07:00] ok - data: {'sort': 1, 'path': u'/media/sf_ddrshared/Mezzzzezzzzez/ddr-densho-349-46_inside_mezz.tif', 'label': u'1234567890-=`~!@#$%^&*()_+qwertyuiop[]\\{}|asdfghjkl;\':"zxcvbnm,./<>?QWERTYUIOPASDFGHJKLZXCVBNM', 'public': u'1', 'rights': u'cc'}
[2018-09-24T15:23:02.479715-07:00] ok - Examining source file
[2018-09-24T15:23:02.479794-07:00] ok - check dir /media/sf_ddrshared/Mezzzzezzzzez/ddr-densho-349-46_inside_mezz.tif (| src_path)
...
[2018-09-24T15:24:43.593925-07:00] ok - Writing entity.json
[2018-09-24T15:24:43.616339-07:00] ok - Staging files
[2018-09-24T15:24:43.624166-07:00] ok - | repo <git.Repo "/var/www/media/ddr/ddr-testing-40097/.git">
[2018-09-24T15:24:43.704570-07:00] ok - | 4 files to stage:
[2018-09-24T15:24:43.704665-07:00] ok - |   files/ddr-testing-40097-1/entity.json
[2018-09-24T15:24:43.704699-07:00] ok - |   files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.json
[2018-09-24T15:24:43.704731-07:00] ok - |   files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.tif
[2018-09-24T15:24:43.704772-07:00] ok - |   files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382-a.jpg
[2018-09-24T15:24:43.704809-07:00] ok - git stage
[2018-09-24T15:24:43.780748-07:00] ok - annex stage
[2018-09-24T15:24:52.491224-07:00] ok - ok
[2018-09-24T15:24:52.503888-07:00] ok - | 3 files staged:
[2018-09-24T15:24:52.503978-07:00] ok - show_staged True
[2018-09-24T15:24:52.504019-07:00] ok - |   files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382-a.jpg
[2018-09-24T15:24:52.504058-07:00] ok - |   files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.json
[2018-09-24T15:24:52.504091-07:00] ok - |   files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.tif
[2018-09-24T15:24:52.504127-07:00] not ok - 3 new files staged (should be 4)
[2018-09-24T15:24:52.504182-07:00] not ok - File staging aborted. Cleaning up
[2018-09-24T15:24:52.504257-07:00] not ok - | mv /var/www/media/ddr/ddr-testing-40097/files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.json /var/www/media/ddr/tmp/file-add/ddr-testing-40097/ddr-testing-40097-1/ddr-testing-40097-1-mezzanine-860da76382.json
[2018-09-24T15:24:52.506357-07:00] not ok - | link (not moving) /var/www/media/ddr/ddr-testing-40097/files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.tif
[2018-09-24T15:24:52.506421-07:00] not ok - | link (not moving) /var/www/media/ddr/ddr-testing-40097/files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382-a.jpg
[2018-09-24T15:24:52.506459-07:00] not ok - finished cleanup. good luck...
[2018-09-24T15:24:52.506494-07:00] not ok - Add file aborted, see log file for details: /var/log/ddr/addfile/ddr-testing-40097/ddr-testing-40097-1.log
[2018-09-24T15:24:52.558893-07:00] not ok - DDRTask.ON_FAILURE
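
The abort above comes from a count mismatch: four paths were handed to git for staging but only three showed up as staged. A minimal sketch of that kind of check, using the paths from this log, might look like the following. The function name `check_staged` is hypothetical and is not taken from the ddr-cmdln code; it only illustrates how the missing path can be named instead of just counted.

```python
def check_staged(expected, staged):
    """Compare the paths we asked git to stage with what it reports as staged.

    Hypothetical helper for illustration; not the actual DDR implementation.
    Returns (missing, unexpected) as sorted lists of paths.
    """
    missing = sorted(set(expected) - set(staged))
    unexpected = sorted(set(staged) - set(expected))
    return missing, unexpected


# Paths copied from the addfile log above.
expected = [
    "files/ddr-testing-40097-1/entity.json",
    "files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.json",
    "files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.tif",
    "files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382-a.jpg",
]
staged = [
    "files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382-a.jpg",
    "files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.json",
    "files/ddr-testing-40097-1/files/ddr-testing-40097-1-mezzanine-860da76382.tif",
]

missing, unexpected = check_staged(expected, staged)
```

With the data from this log, `missing` contains only the entity.json path, which matches the summary at the top of this issue.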

git status

Binary files are left staged but uncommitted, while the deletion of entity.json is unstaged. [NOTE: This is the state after several upload failures.] [NOTE: Copying and pasting text from the terminal is better than a screenshot.]

[Screenshot: 110-testing40k97gitstatus]

gjost commented 5 years ago

I believe this is a side effect of the changes in 1dd04ba2cd, after which DDR.models.common.DDRObject.write_json writes files only if they have actually changed. The file ingest function needs to force a write. Better yet, the write_json function for Entity objects could be a little smarter and detect added/modified/deleted child File objects.
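
As a rough illustration of the "smarter" option, the sketch below compares the child files recorded before and after an ingest and reports what was added, modified, or deleted. Everything here is an assumption made for the example: the function names `diff_child_files` and `needs_write` are hypothetical, and child files are represented as plain dicts mapping a file ID to a checksum rather than as real DDR File objects.

```python
def diff_child_files(old, new):
    """Classify child files as added, modified, or deleted.

    `old` and `new` map a file ID to a checksum (hypothetical representation,
    not the real DDR data model). Returns three sorted lists of file IDs.
    """
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(
        file_id for file_id in set(old) & set(new) if old[file_id] != new[file_id]
    )
    return added, modified, deleted


def needs_write(old, new):
    """True if any child file was added, modified, or deleted."""
    return any(diff_child_files(old, new))
```

Under these assumptions, an entity whose child list gained a new mezzanine file would report it under `added`, so `needs_write` would return True and entity.json would be rewritten without the caller having to force it.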

gjost commented 5 years ago

Added a force arg to write_json and used it in the file ingest function. Fixed in commit 5312b1c.
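
For readers who have not looked at the commit, here is a minimal sketch of the general idea: a write-only-if-changed function with a `force` override. It is not the code from 5312b1c. The signature, the return value, and the JSON formatting options are all assumptions made for illustration.

```python
import json
import os


def write_json(data, path, force=False):
    """Write `data` to `path` as JSON, skipping the write if nothing changed.

    Hypothetical stand-in for the DDR function of the same name.
    With force=True the file is always rewritten.
    Returns True if the file was written, False if the write was skipped.
    """
    new_text = json.dumps(data, indent=4, sort_keys=True)
    if not force and os.path.exists(path):
        with open(path, "r") as f:
            if f.read() == new_text:
                # Contents are identical, so leave the file alone.
                return False
    with open(path, "w") as f:
        f.write(new_text)
    return True
```

In this sketch the ingest code would call `write_json(data, path, force=True)` so that entity.json is always written, while other callers keep the default and avoid needless rewrites.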