mjordan / islandora_workbench

A command-line tool for managing content in an Islandora 2 repository
MIT License

Node for ... (record 1) created at https://islandora.traefik.me0. exception #222

Open DonRichards opened 3 years ago

DonRichards commented 3 years ago

This is probably something I'm doing wrong. I created a super simple CSV (3 rows) and this is the output.

OK, connection to Drupal at https://islandora.traefik.me verified.
Node for "Easthampton Town Hall (Large Image)" (record 1) created at https://islandora.traefik.me0.
Traceback (most recent call last):
  File "/home/don/Desktop/github/islandora_workbench/awesome_env/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/home/don/Desktop/github/islandora_workbench/awesome_env/lib/python3.6/site-packages/urllib3/util/connection.py", line 61, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
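
A minimal sketch of one possible reading of this traceback, assuming (the thread does not confirm this) that the failing request was sent to the URL exactly as printed in the "created at" line. `hostname_of` is a hypothetical helper for illustration and is not part of workbench.

```python
from urllib.parse import urlsplit


def hostname_of(url):
    """Return the hostname a client would try to resolve for this URL."""
    return urlsplit(url).hostname


# Host as configured in config.yml.
print(hostname_of("https://islandora.traefik.me/"))
# URL as printed in the "created at" message (sentence-ending period dropped).
print(hostname_of("https://islandora.traefik.me0"))
```

If the second hostname is what gets looked up, a DNS failure such as "Name or service not known" would be the expected result, since it differs from the configured host.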

config.yml

task: create
host: "https://islandora.traefik.me/"
username: sadmin
password: password
input_csv: i8_sample_objects_workbench_OBJECTS_2.csv

How I run it

./workbench --config config.yml

python --version
Python 3.6.9

CSV

title,id,field_identifier,field_resource_type,field_genre,field_linked_agent,field_model,field_edtf_date_created,field_edtf_date_issued,field_place_published,field_description,field_rights,field_subject,field_member_of,field_language,field_display_hints,file
Easthampton Town Hall (Large Image),1,Easthampton Town Hall,still image,photograph,relators:cre:corporate_body:Historic American Buildings Survey|relators:pbl:corporate_body:Historic American Buildings Survey,Image,1933?,,,Photo number 1. Photographs are albumen prints mounted on cardboard.,No Copyright _ United States,"subject:Black, Francis",79,,Open Seadragon,EasthamptonTownHall.tif
Nehemiah Strong House (Large Image),2,Nehemiah Strong House,still image,photograph,relators:cre:corporate_body:Historic American Buildings Survey|relators:pbl:corporate_body:Historic American Buildings Survey,Image,1933?,,,Photo number 2. Photographs are albumen prints mounted on cardboard.,No Copyright _ United States,"subject:Chittendon, Chester P.",79,,Open Seadragon,NehemiahStrongHouse.tif
"Amherst College, Lawrence Observatory (Large Image)",3,"Amherst College, Lawrence Observatory",still image,photograph,relators:cre:corporate_body:Historic American Buildings Survey|relators:pbl:corporate_body:Historic American Buildings Survey,Image,1933?,,,Photo number 3. Photographs are albumen prints mounted on cardboard.,No Copyright _ United States,"subject:T'Shawn, David",79,,Open Seadragon,AmherstCollegeLawrenceObservatory.tif

Terminal output

stout.txt

mjordan commented 3 years ago

Don, can you try running:

curl -v "https://islandora.traefik.me/"

and let me know what the output is?

mjordan commented 3 years ago

Hmmm.. looking at the output more closely, it appears that the first node was actually created, so I assume that workbench explodes when it tries to create the media. To confirm this, can you add nodes_only: true to your config file and rerun workbench? If all of the nodes are created, it's definitely an issue with how workbench is creating media; in that case, can you confirm that you can create media on that Islandora instance the usual way, via the Drupal GUI?

mjordan commented 3 years ago

@DonRichards when you run workbench with --check what happens? When I run it, I get:

./workbench --config docker.yml --check
Error: Workbench cannot detect whether the Islandora Workbench Integration module is enabled on http://localhost. Please ensure it is enabled.

Do you get the same thing?

DonRichards commented 3 years ago

@mjordan The curl returns a normal response.

After adding the suggested change this is what I see.

❯ ./workbench --config config.yml --check
OK, connection to Drupal at https://islandora.traefik.me verified.
"nodes_only" option in effect. Media files will not be checked/validated.
OK, configuration file has all required values (did not check for optional values).
OK, CSV file input_data/i8_sample_objects_workbench_OBJECTS_2.csv found.
OK, all 3 rows in the CSV file have the same number of columns as there are headers (17).
OK, CSV column headers match Drupal field names.
OK, required Drupal fields are present in the CSV file.
OK, ETDF field values in the CSV file validate.
Warning: Issues detected with validating typed relation field values in the CSV file. See the log for more detail.
Configuration and input data appear to be valid.

DonRichards commented 3 years ago

And this might be helpful

❯ ./workbench --config config.yml
OK, connection to Drupal at https://islandora.traefik.me verified.
"nodes_only" option in effect. No media will be created.
Node for "Easthampton Town Hall (Large Image)" (record 1) created at https://islandora.traefik.me/islandora/EasthamptonTownHallLargeImage.
Node for "Nehemiah Strong House (Large Image)" (record 2) created at https://islandora.traefik.me/islandora/NehemiahStrongHouseLargeImage.
Node for "Amherst College, Lawrence Observatory (Large Image)" (record 3) created at https://islandora.traefik.me/islandora/AmherstCollegeLawrenceObservatoryLargeImage.

❯ ./workbench --config rollback.yml
OK, connection to Drupal at https://islandora.traefik.me verified.
Node EasthamptonTownHallLargeImage not found or not accessible, skipping delete.
Node NehemiahStrongHouseLargeImage not found or not accessible, skipping delete.
Node AmherstCollegeLawrenceObservatoryLargeImage not found or not accessible, skipping delete.

mjordan commented 3 years ago

All signs point to the media REST endpoint. I'm currently not able to get an ISLE-DC environment up and running so can't really troubleshoot, but as soon as I do, I'll investigate.

DonRichards commented 3 years ago

No rush.

mjordan commented 3 years ago

rollback says that the nodes aren't found or accessible. Did they actually get created? The "created" message shouldn't be shown unless Drupal returns the correct response code (201).
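
The rule described above (only report "created" when Drupal answers with 201) could be sketched roughly as follows. This is not workbench's actual code: `created_message` is a hypothetical function, the stand-in response object only mimics the `status_code` and `headers` attributes of an HTTP response, and reading the node URL from a `Location` header is an assumption.

```python
from types import SimpleNamespace


def created_message(title, row_number, response):
    """Return a 'created' message only for HTTP 201; otherwise return None."""
    if response.status_code != 201:
        return None
    # Assumption: the new node's URL is taken from the Location header.
    location = response.headers.get("Location", "(no Location header)")
    return f'Node for "{title}" (record {row_number}) created at {location}.'


# Made-up responses for illustration; example.com is a placeholder host.
ok = SimpleNamespace(status_code=201,
                     headers={"Location": "https://example.com/node/1"})
denied = SimpleNamespace(status_code=403, headers={})

print(created_message("Easthampton Town Hall (Large Image)", 1, ok))
print(created_message("Easthampton Town Hall (Large Image)", 1, denied))
```

Under this sketch, a "created" line in the output implies a 201 from Drupal, which is why the rollback result is surprising.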

DonRichards commented 3 years ago

They did get created but with no media.

DonRichards commented 3 years ago

Adding this here although I already said it in Slack. I had to run $ python3 -m pip install python-magic because of another issue with the magic module not being installed. That removes the errors, but I'm still not seeing the media. This might be an issue with my environment, so I'm going to reboot and check.

mjordan commented 3 years ago

The magic library is only used if the files you are ingesting are remote (e.g. paths to them start with http...). I don't think that's the problem. AFAIK you are the first person to try workbench with ISLE-DC, so I assume that the problem is the routing to the REST endpoint that Islandora provides to create media.

One thing that is also puzzling is why rolling back isn't working, if the nodes have in fact been created.
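
For reference, the remote-file condition mentioned in the comment above (paths that start with http) might look something like this. It is a sketch under that assumption only; `is_remote` is a hypothetical name and the real check in workbench may differ.

```python
def is_remote(path):
    """Treat a file value as remote when it starts with http:// or https://."""
    return path.lower().startswith(("http://", "https://"))


# A file value from the CSV in this issue, and a made-up remote URL.
print(is_remote("EasthamptonTownHall.tif"))
print(is_remote("https://example.com/files/EasthamptonTownHall.tif"))
```

The CSV in this issue uses plain file names, so by this check none of its rows would count as remote.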