mjordan / islandora_workbench

A command-line tool for managing content in an Islandora 2 repository
MIT License

Ingest of items fails with HTTP 422 #106

Open bseeger opened 4 years ago

bseeger commented 4 years ago

Not quite sure what I need to do here - I keep getting this error for all my items:

 11-May-20 16:17:02 - INFO - Term "Digital Document" (term ID 27) already exists in vocabulary "islandora_models".
11-May-20 16:17:02 - INFO - Term "PDFjs" (term ID 3) already exists in vocabulary "islandora_display".
11-May-20 16:17:02 - INFO - Term "eng" (term ID 44) already exists in vocabulary "language".
11-May-20 16:17:02 - WARNING - ERROR: {"message":"Unprocessable Entity: validation failed.\nfield_subject.0.target_id: The referenced entity (taxonomy_term: None) does not exist.\nfield_subject.0.target_id: This value should be of the correct primitive type.\nfield_tags: This value should not be null.\n"}
11-May-20 16:17:02 - WARNING - Node for CSV record 22 not created, HTTP response code was 422.
11-May-20 16:17:03 - INFO - Term "Audio" (term ID 22) already exists in vocabulary "islandora_models".
11-May-20 16:17:03 - INFO - Term "eng" (term ID 44) already exists in vocabulary "language".
11-May-20 16:17:03 - WARNING - ERROR: {"message":"Unprocessable Entity: validation failed.\nfield_subject.0.target_id: The referenced entity (taxonomy_term: None) does not exist.\nfield_subject.0.target_id: This value should be of the correct primitive type.\nfield_tags: This value should not be null.\n"}
11-May-20 16:17:03 - WARNING - Node for CSV record 23 not created, HTTP response code was 422.
11-May-20 16:17:03 - INFO - Term "Audio" (term ID 22) already exists in vocabulary "islandora_models".
11-May-20 16:17:03 - INFO - Term "eng" (term ID 44) already exists in vocabulary "language".
11-May-20 16:17:03 - WARNING - ERROR: {"message":"Unprocessable Entity: validation failed.\nfield_tags: This value should not be null.\n"}
11-May-20 16:17:03 - WARNING - Node for CSV record 24 not created, HTTP response code was 422.

my create.yml file and CSV file can be found here: https://gist.github.com/bseeger/8e07acb64bf1b42cadd34b458254684b

bseeger commented 4 years ago

Note -- One key thing I'm doing here is trying to put items into a collection, so it might be a bit of a stretch for what you've got working so far.

bseeger commented 4 years ago

JSON that's getting sent with the POST to create the item:

{
   "type":[
      {
         "target_id":"islandora_object",
         "target_type":"node_type"
      }
   ],
   "title":[
      {
         "value":"Oral history of William C. Richardson"
      }
   ],
   "status":[
      {
         "value":1
      }
   ],
   "field_language":[
      {
         "target_id":"44",
         "target_type":"taxonomy_term"
      }
   ],
   "field_member_of":[
      {
         "target_id":"7",
         "target_type":"node_type"
      }
   ],
   "field_edtf_date_issued":[
      {
         "value":"2019-01-16T18:57:41Z"
      }
   ],
   "field_identifier":[
      {
         "value":"http://jhir.library.jhu.edu/handle/1774.2/59943"
      }
   ],
   "field_rights":[
      {
         "value":"Single copies may be made for research purposes. Researchers are responsible for determining any copyright questions. It is not necessary to seek our permission as the owner of the physical work to publish or otherwise use public domain materials that we have made available for use, unless Johns Hopkins University holds the copyright. If you are the copyright owner of this content and wish to contact us regarding our choice to provide access to this material online, please visit our takedown policy at https://www.library.jhu.edu/policy/digital-collections-statement-use-takedown-policy/."
      }
   ],
   "field_description":[
      {
         "value":"William C. Richardson was president of Johns Hopkins University and professor of health policy and management from 1990-1995. He holds an MBA and PhD in business from the University of Chicago, where he specialized in health care delivery. He also served as graduate dean at the University of Washington and as provost at Pennsylvania State University before being recruited to Johns Hopkins University. Following his presidency, Richardson became the head of the W.K. Kellogg Foundation. In this oral history, Richardson discusses his tenure as president, including the state of the university’s finances and departments at the time of his arrival and throughout his time at the institution. He touches on his first impressions of the university and the strategic decisions he made during his tenure as the university’s president. The interview took place over two sessions, both of which are available to access. This oral history is a part of the Mame Warren oral histories series."
      }
   ],
   "field_edtf_date_created":[
      {
         "value":"2019-01-16T18:57:41Z"
      }
   ],
   "field_model":[
      {
         "target_id":"22",
         "target_type":"taxonomy_term"
      }
   ]
}

bseeger commented 4 years ago

Also, I already have some of these items in my system... But thought they'd get added again just fine, since I don't think this checks for duplicates based on title. Just sharing some variables that might be in play.

mjordan commented 3 years ago

@bseeger I'm sorry, I didn't see this issue until now.

Adding to a collection isn't a problem. --check will validate the existence of nodes referenced in field_member_of.

422 is the response you get when a taxonomy term ID doesn't exist for the field that you've got it in. I notice that you are telling workbench to create new terms on the fly, since you have added allow_adding_terms: true to your config file. If you don't have the string "None" in your CSV for the field you want to create terms for (in the above log output, field_subject), I suspect the "None" is coming from Python (it is the string form of Python's null value), so this is definitely a code-level error.

Would you be able to share your CSV file with me (no need for binaries), and an export of the vocabulary linked to your field_subject so I can try to replicate and debug? Emailing them to me is fine if you don't want to post them here.

Edit: no need for a dump of your vocabulary.
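
For readers following along, here is a minimal sketch of how a Python None could end up as the string "None" in a request body, as suspected above. It is an illustration only, not Workbench's actual code: both helper functions are hypothetical, and the payload shape is simply copied from the JSON posted earlier in this thread.

```python
import json


def build_subject_reference(term_id):
    """Hypothetical helper: build an entity reference from a term ID."""
    # str(None) is the literal string "None", so a failed term lookup
    # silently leaks into the payload instead of raising an error.
    return {"target_id": str(term_id), "target_type": "taxonomy_term"}


def build_subject_reference_checked(term_id):
    """Hypothetical safer variant: fail early when no term ID was found."""
    if term_id is None:
        raise ValueError("No term ID found; refusing to build a reference.")
    return {"target_id": str(term_id), "target_type": "taxonomy_term"}


# Simulate a term lookup that found nothing.
failed_lookup = None
payload = {"field_subject": [build_subject_reference(failed_lookup)]}
print(json.dumps(payload))
# {"field_subject": [{"target_id": "None", "target_type": "taxonomy_term"}]}
```

A reference like this would plausibly produce the "taxonomy_term: None" wording seen in the log output, which is why checking for None before building the payload is the safer pattern.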

mjordan commented 3 years ago

Also, can you confirm your Drupal has its "Taxonomy term" REST endpoint enabled as described in Workbench's Requirements section?

mjordan commented 3 years ago

.... aaaaaaand I see you have provided your CSV in the gist. I'll investigate.

mjordan commented 3 years ago

@bseeger #111 is causing the problem. Specifically, trying to create a taxonomy term in a vocabulary that has fields other than the required term name (in your case, asking workbench to create terms in the Subject vocabulary) results in an invalid JSON structure, which causes the 422 (Unprocessable Entity).

Edit: see next comment.

mjordan commented 3 years ago

@bseeger My diagnosis in the last comment was incorrect. field_subject has multiple vocabularies linked to it (Corporate body, Family, Person, Geographic location, Subject). When the CSV adds new terms to a field that has multiple vocabs, the terms need to be namespaced with the machine name of the target vocabulary (e.g., subject: Student life) in order to tell Drupal which vocabulary to add the new term to. The --check option will warn you if a field falls into this category.
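
The namespacing convention described above can be sketched as follows. This is a hedged illustration rather than Workbench's own parsing code: the function name split_namespaced_term and its default_vocabulary parameter are hypothetical, and the sketch assumes the vocabulary machine name is everything before the first colon.

```python
def split_namespaced_term(value, default_vocabulary=None):
    """Split 'vocabulary: Term name' into (vocabulary, term name).

    Hypothetical helper. Assumes the first colon separates the vocabulary
    machine name from the term name.
    """
    vocabulary, separator, name = value.partition(":")
    if separator and name.strip():
        return vocabulary.strip(), name.strip()
    # No namespace present: only acceptable if the field has one vocabulary.
    if default_vocabulary is None:
        raise ValueError(
            f"Term {value!r} has no vocabulary namespace and the field "
            "is linked to more than one vocabulary."
        )
    return default_vocabulary, value.strip()


print(split_namespaced_term("subject: Student life"))
# ('subject', 'Student life')
```

One limitation of this sketch: a term name that itself contains a colon but carries no namespace would be split incorrectly, so real code would also need to confirm that the prefix is a known vocabulary.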

mjordan commented 3 years ago

@bseeger another reason why ingest failed: #103. field_rights (at least in its default configuration) has a length limit of 255. Your field_rights values are longer than that. I'll move on to resolving #103 so --check will warn you of this.
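
Until a check like that is available, a length problem of this kind can be caught before ingest with a short script. The sketch below assumes the 255-character limit mentioned above and a CSV column named field_rights; the function find_overlong_values is hypothetical and is not part of Workbench.

```python
import csv
import io

MAX_LENGTH = 255  # limit cited above for field_rights in its default configuration


def find_overlong_values(csv_text, column, max_length=MAX_LENGTH):
    """Return (record number, value length) for each value over the limit."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    # Record numbers count data rows only, starting at 1 after the header.
    for record_number, row in enumerate(reader, start=1):
        value = row.get(column) or ""
        if len(value) > max_length:
            problems.append((record_number, len(value)))
    return problems


sample = "title,field_rights\nShort,OK\nLong," + "x" * 300 + "\n"
print(find_overlong_values(sample, "field_rights"))
# [(2, 300)]
```

To run it against a real input file, read the file's contents into a string and pass that in place of sample.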

mjordan commented 3 years ago

#103 is now resolved.

mjordan commented 3 years ago

@bseeger are you able to test this now that a couple of issues have been resolved? If not, I can do my best to smoke test with the data you provided.

bseeger commented 3 years ago

Hi @mjordan I can try to test this early next week.

mjordan commented 3 years ago

Thank you. I apologize for letting this sit in the issue queue for so long. Somehow I had notifications turned off on this repo (!).

I think that all of the issues that you encountered have been resolved.

bseeger commented 3 years ago

No worries - I've been focused on other things, so this hasn't been an issue for me at all.

bseeger commented 3 years ago

Hi @mjordan - I tried this out and seem to get further along in the ingest, though I was unable to get it to succeed as there are other things going on (some might be my data). I don't have more time to look into it, but wanted to share this little bit of data with you.