archivematica / Issues

Issues repository for the Archivematica project
GNU Affero General Public License v3.0

Problem: There isn't a lot of information on post-store callback in Archivematica, especially the original <source_id> post-store AIP call-back #947

Open ross-spencer opened 4 years ago

ross-spencer commented 4 years ago

Version of the documentation

Archivematica Storage Service: 0.15.

Page (and section, if applicable) where the issue occurs

Storage service API wiki and docs themselves.

Description of the issue

The Storage Service supports two types of callback. A callback is a message sent to another server after a specific trigger: either the successful storage of an AIP, or an explicit call to the Storage Service asking it to repeat the message so that it can be consumed if it was missed previously.

The documentation for the old-style callback is limited, and the callback itself is difficult to trigger. The old-style callback is specifically Post-store AIP (source files), or post_store, as opposed to post_store_<package_type>, e.g. post_store_aip, post_store_dip, post_store_aic. The send_callback/post_store API endpoint is entirely for the old-style callback.

Eliciting a response from the endpoint to see what we can expect...

If I create an artificial state that will help us to mimic the interaction, i.e. manually add files to the Files model for a specific AIP, then I can make an API call as follows:

http --pretty=format \
    GET "http://127.0.0.1:62081/api/v2/file/420216b1-692e-40ec-9cfa-e94b3271a2d2/send_callback/post_store/" \
    Authorization:"ApiKey test:test"
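
For anyone scripting this rather than using HTTPie, the same request can be sketched with the Python standard library. The helper name `build_send_callback_request` is made up for illustration; the URL pattern and the `ApiKey <user>:<key>` header are copied from the command above, and nothing here is taken from the Storage Service code.

```python
from urllib.request import Request


def build_send_callback_request(base_url, file_uuid, username, api_key):
    """Build (but do not send) the GET request for the old-style callback.

    Hypothetical helper: it only mirrors the HTTPie command in this issue.
    """
    url = f"{base_url.rstrip('/')}/api/v2/file/{file_uuid}/send_callback/post_store/"
    headers = {"Authorization": f"ApiKey {username}:{api_key}"}
    return Request(url, headers=headers, method="GET")


request = build_send_callback_request(
    "http://127.0.0.1:62081",
    "420216b1-692e-40ec-9cfa-e94b3271a2d2",
    "test",
    "test",
)

# Sending the request needs a running Storage Service, so it is left commented out.
# Note that urlopen raises urllib.error.HTTPError for a 500 response.
# from urllib.request import urlopen
# with urlopen(request) as response:
#     print(response.status)
```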

If I set up a server to receive arbitrary requests, I can read the request as per below:

2019/10/15 11:38:51 POST / HTTP/1.1
Host: 127.0.0.1:8080
connection: keep-alive
accept-encoding: gzip, deflate
accept: */*
user-agent: python-requests/2.21.0
content-type: application/json
custom-header: Kind regards
content-length: 57

2019/10/15 11:38:51 Headers: map[Accept-Encoding:[gzip, deflate] Accept:[*/*] User-Agent:[python-requests/2.21.0] Content-Type:[application/json] Custom-Header:[Kind regards] Content-Length:[57] Connection:[keep-alive]]
2019/10/15 11:38:51 Content: map[File UUID:bc01b7b7-9339-424a-a469-42bc8698661c]

In this instance, the Content translates to JSON:

{
   "File UUID": "bc01b7b7-9339-424a-a469-42bc8698661c"
}

If there were three files in this AIP, this would repeat three times with different file UUIDs.
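
A receiver for these messages could look something like the sketch below. It is not the server I used to capture the log above; the class and function names are invented, and replying with a 200 is an assumption, since I have not confirmed which status codes the Storage Service treats as a successful delivery.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Collected "File UUID" values, one per callback POST received.
received_file_uuids = []


class CallbackHandler(BaseHTTPRequestHandler):
    """Accept the old-style callback POST and record its File UUID."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        received_file_uuids.append(payload.get("File UUID"))
        self.send_response(200)  # assumption: any 2xx is acceptable to the sender
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet; remove to see the default access log


def make_server(host="127.0.0.1", port=8080):
    """Create (but do not start) the receiver on the callback URI's host and port."""
    return HTTPServer((host, port), CallbackHandler)


# To run it: make_server().serve_forever()
```

With three files in the AIP, `received_file_uuids` would end up holding three entries, one per POST.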

The response to my original API call is:

HTTP/1.1 204 NO CONTENT
Connection: keep-alive
Content-Language: en
Content-Length: 0
Date: Tue, 15 Oct 2019 11:54:33 GMT
Server: nginx/1.16.0
Vary: Accept, Accept-Language, Cookie
X-Frame-Options: SAMEORIGIN

HttpNoContent is on the happy path in the code, so we should update the docs to say that the expected response to a successful GET request is a 204.

If the message cannot be delivered, the response says so and reports how many POSTs failed:

HTTP/1.1 500 INTERNAL SERVER ERROR
Connection: keep-alive
Content-Language: en
Content-Type: application/json
Date: Tue, 15 Oct 2019 11:56:14 GMT
Server: nginx/1.16.0
Transfer-Encoding: chunked
Vary: Accept, Accept-Language, Cookie
X-Frame-Options: SAMEORIGIN

{
    "callback_uris": [
        "http://127.0.0.1:8080"
    ],
    "failure_count": 3,
    "message": "Failed to POST 3 responses to callback URI"
}
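
A client could tell the two outcomes apart along these lines. This is only a sketch based on the two responses captured above; the function name is hypothetical, and any status other than 204 or 500 is treated as unexpected because I have not observed one.

```python
import json


def summarize_send_callback_response(status, body=b""):
    """Turn a send_callback response into a one-line summary."""
    if status == 204:
        return "callback delivered"
    if status == 500 and body:
        details = json.loads(body)
        uris = ", ".join(details.get("callback_uris", []))
        message = details.get("message", "callback failed")
        failures = details.get("failure_count", "?")
        return f"{message} ({failures} failures; URIs: {uris})"
    return f"unexpected status {status}"
```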

The callback is configured as follows:

[image: callback configuration; not reproduced here]

If the AIP is stored correctly, this callback is triggered. Again, artificially populating the File model is what makes this work for us.

Source ID translates to a UUID assigned to a specific file in an AIP per the File model.

This model is never normally populated per https://github.com/archivematica/Issues/issues/342.

Suggested fix

  1. Incorporate specifics about the callback into the various documentation (wiki, api-docs).
  2. Specifically, it seems this call is limited to SWORD2 workflows around Archidora/Islandora.
  3. Other parts of the configuration work the same as the new-style callback, in that headers are configurable.
  4. Document that this API endpoint doesn't cover the new-style callbacks at all, and that for those a user would need functionality like that suggested in https://github.com/archivematica/Issues/issues/1164 or https://github.com/archivematica/Issues/issues/881.

Other things that we may consider include some sort of automated test (AMAUAT) that triggers this call for Fedora-style SWORD2 ingests, to make sure it keeps working consistently with the integration. The feature file and code would also provide a useful reference for developers in future.

Additional information

Most of the information above is included to (hopefully) make it easier for someone looking at this in future to understand the capability of the endpoint. Some additional context is included below.


replaceafill commented 4 years ago

I investigated this endpoint today and found an additional issue: it assumes that the checksum algorithm is always SHA-512 and looks for a manifest-sha512.txt file in the bag layout of the AIP.

However, the default value for the checksum algorithm setting is sha256, which produces a manifest-sha256.txt file instead, and the user can also change it from the dashboard.
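
One way the lookup could avoid hard-coding SHA-512 is to discover whichever payload manifest the bag actually contains. The sketch below is not the Storage Service's code; `find_payload_manifests` is a hypothetical helper, and it assumes the usual BagIt layout where payload manifests sit in the bag root as manifest-<algorithm>.txt.

```python
from pathlib import Path


def find_payload_manifests(bag_dir):
    """Return {algorithm: path} for each manifest-<algorithm>.txt in the bag root.

    Tag manifests (tagmanifest-<algorithm>.txt) are not matched, because the
    glob pattern requires the file name to start with "manifest-".
    """
    manifests = {}
    for path in sorted(Path(bag_dir).glob("manifest-*.txt")):
        algorithm = path.stem[len("manifest-"):]
        manifests[algorithm] = path
    return manifests
```

For a bag created with the default setting, this would return a single `sha256` entry rather than failing to find manifest-sha512.txt.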

ross-spencer commented 4 years ago

@replaceafill you might find the linked issue helpful with regard to sha-512. I'll add the link again here: https://github.com/artefactual/archivematica-storage-service/issues/225.

replaceafill commented 4 years ago

Ah, thanks @ross-spencer! I definitely missed that link :grin: