archivematica / Issues

Issues repository for the Archivematica project
GNU Affero General Public License v3.0
16 stars 1 forks source link

Problem: SIPs started from ArchivesSpace pane fail when a parent object does not have a title #799

Closed djpillen closed 4 years ago

djpillen commented 5 years ago

Expected behaviour A SIP started from the ArchivesSpace pane creates a set of Dublin Core metadata for the SIP containing a "relation" element with the package's intellectual arrangement from ArchivesSpace. The intellectual arrangement should be able to consist of a combination of titles and/or dates.

Current behaviour A SIP started from the ArchivesSpace pane fails when a parent object does not contain a title.

Steps to reproduce Create an intellectual arrangement in ArchivesSpace consisting of at least one archival object with only a date subrecord and no title. Add a new archival object to that arrangement in the Appraisal tab's ArchivesSpace pane, create a new digital object under that archival object, drag and drop some content onto the digital object, and then select "Finalize Arrangement." A message indicating that there was an error finalizing arrangement and to check the dashboard logs will display. The following trace is from the dashboard logs:

ERROR     2019-07-16 15:59:32  django.request:base:handle_uncaught_exception:256:  Internal Server Error: /access/archivesspace/-repositories-2-archival_objects-989128/copy_from_arrange/
Traceback (most recent call last):
  File "/usr/share/python/archivematica-dashboard/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/share/archivematica/dashboard/components/access/views.py", line 90, in wrapper
    return func(request, mapping, system=system)
  File "/usr/share/archivematica/dashboard/components/access/views.py", line 71, in wrapper
    return func(client, *args, **kwargs)
  File "/usr/share/archivematica/dashboard/components/access/views.py", line 455, in access_arrange_start_sip
    relation = [parent_obj['title']] + relation
KeyError: 'title'

Your environment (version of Archivematica, OS version, etc) First noticed on our production instance of Archivematica, which is on Archivematica v1.6, but the relevant code still exists in stable\1.9 and qa\1.x


For Artefactual use: Please make sure these steps are taken before moving this issue from Review to Done:

sromkey commented 5 years ago

Hey @djpillen ! Thanks for reporting and also the PR. We're at code freeze for 1.10 so I've put on the triage-release-1.11 label for potential inclusion there.

sallain commented 5 years ago

Was working a bit with ArchivesSpace today and looking up issues, and @jhsimpson has put a good explanation of the AS metadata requirements in a comment on another issue: https://github.com/archivematica/Issues/issues/149#issuecomment-478137198

So it seems like the requirement for title (as well as general note, rights statement, etc.) is a requirement to make the ASpace-DSpace-Archivematica integration all work together.

eckardm commented 5 years ago

Hi @sallain! I was using Archivematica today and actually ran into this very issue, which prompted me to talk to @djpillen, which prompted him to point me here.

Just so it's clear, this issue isn't related to having a title or not. That's definitely required for the integration.

However, for each component in ArchivesSpace, ArchivesSpace generates a "display string" corresponding to the title (if there's only a title), the date (if there's only a date), and a concatenation of the two (if both are present). All three are acceptable in ArchivesSpace, and all three end up in the display string.

Archivematica works fine except when when there's only a date because it's looking for the title in the title field. Looking for the title in display string instead (as Dallas suggests in his PR) would solve the problem, while ensuring there's a "title" for DSpace. Does that make sense?

ross-spencer commented 4 years ago

@sromkey we chatted in the backlog grooming yesterday. My notes from November were that the fix looks solid -- it's a small one-line fix (low-risk) and from the issue notes can make life a bit easier for ArchivesSpace folk.

What I feel is missing is we haven't any tests but that is in general for our ArchivesSpace integration components in the Dashboard.

The ArchivesSpace API docs on the anticipated cardinality of title or display_string could be clearer but if we look into the schemas in ArchivesSpace then I think we can see title as optional: here compare the ifmissing field to this one: here where I presume an error needs to be raised if the field is missing.

Compare to to @jambun's fix here and the corresponding schema for Accession Title: here which I think is consistent with what is being said here.

So I feel the logic is described well and we probably shouldn't assume title. The onus is still a little bit on us to just make sure that it's tested -- I'm not sure we have too many ways around that. One outcome might be to follow Jame's approach and look for title first (maintaining current behavior as much as possible) and then look for display string if it isn't there. @djpillen's do you have any immediate availability to attempt that?

I haven't an immediate answer to developing better tests here in short-order.

NB We have one example API request in our repo: here that has the display_string and title and we have some available in tutorials online here

FWIW I wasn't making these same connections when I first looked at this in November/December so if we end up missing out on this for 1.11 for any reason then that's on me somewhat.

sallain commented 4 years ago

@ross-spencer @sromkey Would it be helpful to separate out the need for tests into a separate ticket and triage it for 1.12? It would be really great to have some, but is not critical (very important!!! but not critical.)

ross-spencer commented 4 years ago

@sallain shall we? I like the statement of intent we make by doing that.

djpillen commented 4 years ago

Thanks for looking into this @ross-spencer! It's much appreciated. You're right that the ArchivesSpace API docs and the schemas do not make it very clear what the relationship is between display_string, title, and dates, so I've tracked down a few of the scattered bits of the ArchivesSpace code base that add up to "every archival object must have a title OR a date" and "every archival object will have a display string."

Here is where the "must have a title OR a date" validation occurs: https://github.com/archivesspace/archivesspace/blob/master/common/validations.rb#L389-L398

Here is where the conditional requirements are spelled out in the title tooltip and dates tooltip

Here is where a display_string is generated by concatenating the title (if it exists) and the first dates sub-record (if any exist): https://github.com/archivesspace/archivesspace/blob/master/backend/app/model/archival_object.rb#L64-L82

The proposed solution of looking for a title first, to maintain current behavior, and then looking for a display string if a title is not present sounds reasonable to me. Is that a change that you'd like for me to make to the pull request that I submitted?

ross-spencer commented 4 years ago

Hi @djpillen if you can, that would be great! And really appreciated. As you point out, the display string combines the two fields if present, I think the rationale to maintain the title if present should be to maintain some of the original behavior of Archivematica and reduce the potential for error.

These extra links are super helpful.

I'll be away later today and have a short day tomorrow, but I'll be around to hopefully review and merge this next week if you have the capacity during that time. That would work best in terms of helping keep the release on schedule but do let us know if that's too tight.

djpillen commented 4 years ago

I can make the modification to the code, but unfortunately I'm not sure that I can update the pull request -- the PR is currently coming from an "unknown repository". I must have deleted the repo or the branch at some point without realizing the downstream consequences for making modifications to this PR.

Is there a preferred way to resolve this? Close the current PR and submit a new one?

Thanks!

ross-spencer commented 4 years ago

Hmm, that is a pain, but easily done, and it's not such a huge issue on our side though, so a new PR would be perfectly acceptable!

djpillen commented 4 years ago

Done!

Thanks again.

sallain commented 4 years ago

I believe that this is now working. Following @djpillen's instructions, this is what I did:

  1. I created an archival object in ArchivesSpace with no title. It just has a level of description (required by AS) and a date of creation.
  2. In the ArchivesSpace pane of the Appraisal tab, I located my archival object.
  3. I selected my archival object and clicked Add new child record. I gave the child record a name, level of description, and dates (required by AM interface). I clicked Save.
  4. I selected the new child record and clicked Add new digital object.
  5. I dragged a file from the backlog pane to the new digital object.
  6. I selected the child record and clicked Finalize arrangement, and then proceeded with the ingest.

Both my child record and the digital object record were created in ArchivesSpace as expected.

Tested in qa/1.x on Bionic.