denshoproject / ddr-cmdln

Command-line tools for automating the Densho Digital Repository's various processes.

(develop) ddrimport fails to import external files #84

Closed GeoffFroh closed 6 years ago

GeoffFroh commented 6 years ago

ddrimport on develop will not import external-type files from a CSV.

The included output (and attached CSV) used ddr-csujad-28 as the target repository. A JSON file for the external file was produced in the repo (ddr-csujad-28-1-mezzanine-93bba06d9c), but the file metadata was not added to the file_groups attribute of the parent entity JSON. The command also created an empty directory named with the identifier of the external file.

(ddrlocal)ddr@DDREditor:/media/qnfs/kinkura/working/csujadimport/processed/oh/ddr-csujad-28/csuci_vcc-jic/csuci_vcc-jic_access$ ddr-import file /media/qnfs/kinkura/working/csujadimport/processed/oh/ddr-csujad-28/csuci_vcc-jic/csuci_vcc-jic_access/ddr-csujad-28-mezzanine-files-1.csv /media/qnfs/kinkura/gold/ddr-csujad-28 | tee -a /media/qnfs/kinkura/working/logs/ddrimport_ddr-csujad-28-files.log
2018-08-03 16:25:40,256 DEBUG    <DDR.identifier.Identifier collection:ddr-csujad-28>
2018-08-03 16:25:40,256 DEBUG    /media/qnfs/kinkura/gold/ddr-csujad-28
2018-08-03 16:25:40,256 INFO     batch import files ----------------------------
2018-08-03 16:25:40,260 DEBUG    csv_dir /media/qnfs/kinkura/working/csujadimport/processed/oh/ddr-csujad-28/csuci_vcc-jic/csuci_vcc-jic_access
2018-08-03 16:25:40,260 DEBUG    entity_class <class 'DDR.models.entity.Entity'>
2018-08-03 16:25:40,260 DEBUG    <git.Repo "/media/qnfs/kinkura/gold/ddr-csujad-28/.git">
2018-08-03 16:25:40,260 INFO     Reading /media/qnfs/kinkura/working/csujadimport/processed/oh/ddr-csujad-28/csuci_vcc-jic/csuci_vcc-jic_access/ddr-csujad-28-mezzanine-files-1.csv
2018-08-03 16:25:40,261 INFO     2 rows
2018-08-03 16:25:40,261 INFO     csv_load rowds
2018-08-03 16:25:40,272 DEBUG    ok
2018-08-03 16:25:40,272 INFO     - - - - - - - - - - - - - - - - - - - - - - - -
2018-08-03 16:25:40,272 INFO     Updating existing files
2018-08-03 16:25:40,272 DEBUG    0 updated in 0:00:00.000030
2018-08-03 16:25:40,272 INFO     - - - - - - - - - - - - - - - - - - - - - - - -
2018-08-03 16:25:40,272 INFO     Adding new files
2018-08-03 16:25:40,273 INFO     + 1/2 - ddr-csujad-28-1 (csuci_vcc-jic_0001_01.pdf)
2018-08-03 16:25:40,273 DEBUG    | parent <DDR.models.entity.Entity entity:ddr-csujad-28-1>
2018-08-03 16:26:10,710 DEBUG    |   file <DDR.models.files.File file:ddr-csujad-28-1-mezzanine-7ce51cb368>
2018-08-03 16:26:10,710 DEBUG    | 0:00:30.437395
2018-08-03 16:26:10,710 INFO     + 2/2 - ddr-csujad-28-1 (csuci_vcc-jic_0001_02.mp3)
2018-08-03 16:26:10,710 DEBUG    | parent <DDR.models.entity.Entity entity:ddr-csujad-28-1>
Traceback (most recent call last):
  File "/opt/ddr-local/venv/ddrlocal/bin/ddr-import", line 7, in <module>
    __import__('pkg_resources').run_script('ddr-cmdln==0.9.4b0', 'ddr-import')
  File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 654, in run_script
type(file_) <class 'DDR.models.files.File'>
file_ <DDR.models.files.File file:ddr-csujad-28-1-mezzanine-93bba06d9c>
type(entity.files) <type 'list'>
    self.require(requires)[0].run_script(script_name, ns)
  File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1434, in run_script
    exec(code, namespace, namespace)
  File "/opt/ddr-local/venv/ddrlocal/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/EGG-INFO/scripts/ddr-import", line 228, in <module>
    main()
  File "/opt/ddr-local/venv/ddrlocal/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/EGG-INFO/scripts/ddr-import", line 212, in main
    row_end=row_end,
  File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/batch.py", line 745, in import_files
    log_path, dryrun
  File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/batch.py", line 849, in _add_new_files
    show_staged=False
  File "/opt/ddr-local/venv/ddrlocal/local/lib/python2.7/site-packages/ddr_cmdln-0.9.4b0-py2.7.egg/DDR/ingest.py", line 569, in add_external_file
    print('type(entity.files[0]) %s' % type(entity.files[0]))
IndexError: list index out of range
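
For reference, the traceback above ends in a debug print that indexes entity.files[0] while entity.files is an empty list. A minimal sketch of a guarded version of that print follows. Note that describe_first_file is a hypothetical helper written for illustration, not part of ddr-cmdln, and it only avoids the IndexError; it does not explain why the list is empty.

```python
def describe_first_file(files):
    """Return a debug string for the first item in files.

    Hypothetical helper (not part of ddr-cmdln): mirrors the debug print
    in the traceback, but checks for an empty list before indexing.
    """
    if not files:
        # Indexing files[0] here would raise IndexError.
        return 'entity.files is empty'
    return 'type(entity.files[0]) %s' % type(files[0])


# An empty list no longer raises; a populated list reports the item type.
print(describe_first_file([]))
print(describe_first_file(['ddr-csujad-28-1-mezzanine-93bba06d9c']))
```

With a guard like this the import would presumably have run past the debug print, though the parent entity's file list would still need to be checked afterwards.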

ddr-csujad-28-mezzanine-files-1.csv.txt

ddr-csujad-28-1-mezzanine-93bba06d9c.json.txt

GeoffFroh commented 6 years ago

(Note: we still have the local changes on the copy of ddr-csujad-28 in the gold dir on the qumulo, but did not push them to mits, so you can test with the attached CSV by cloning your own copy. We'll also leave the working dir in place here so you can remote in and take a look.)

sarabeckman commented 6 years ago

Derp -- wrong syntax. Still using the old ddr-import command instead of ddrimport.