archivematica / Issues

Issues repository for the Archivematica project
GNU Affero General Public License v3.0

Problem: Fixity fails for compressed AIPs if there isn't a pointer file for them #920

Open ross-spencer opened 5 years ago

ross-spencer commented 5 years ago

Expected behaviour

The fixity check should be robust enough to extract a compressed package and verify it without needing a pointer file.

Current behaviour

The behaviour can be reproduced by deliberately deleting a pointer file. First, confirm that the fixity check works while the pointer file is still present:

$ http -v --pretty=format \
>     GET "http://127.0.0.1:62081/api/v2/file/896555c9-e27d-43ec-8c2d-265428b74bde/check_fixity/" \
>     Authorization:"ApiKey test:test"     

GET /api/v2/file/896555c9-e27d-43ec-8c2d-265428b74bde/check_fixity/ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Authorization: ApiKey test:test
Connection: keep-alive
Host: 127.0.0.1:62081
User-Agent: HTTPie/1.0.3

HTTP/1.1 200 OK
Cache-Control: no-cache
Connection: keep-alive
Content-Language: en
Content-Type: application/json
Date: Wed, 25 Sep 2019 16:07:24 GMT
Server: nginx/1.16.0
Transfer-Encoding: chunked
Vary: Accept, Accept-Language, Cookie
X-Frame-Options: SAMEORIGIN

{
    "failures": {
        "files": {
            "changed": [],
            "missing": [],
            "untracked": []
        }
    },
    "message": "",
    "success": true,
    "timestamp": null
}
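
The same request can be scripted, which is handy when checking many AIPs. The following is a minimal sketch using only the Python 3 standard library; the endpoint path and the `ApiKey user:key` header format are taken from the HTTPie call above, while the function names `build_fixity_request` and `check_fixity` are hypothetical helpers, not part of the Storage Service.

```python
import json
import urllib.request


def build_fixity_request(base_url, package_uuid, username, api_key):
    """Build (but do not send) a GET request for the check_fixity endpoint."""
    url = "{}/api/v2/file/{}/check_fixity/".format(
        base_url.rstrip("/"), package_uuid
    )
    request = urllib.request.Request(url, method="GET")
    # Same header format as the HTTPie example: "ApiKey <user>:<key>".
    request.add_header(
        "Authorization", "ApiKey {}:{}".format(username, api_key)
    )
    return request


def check_fixity(request, timeout=60):
    """Send the request and return the decoded JSON report.

    urlopen raises urllib.error.HTTPError for non-2xx responses, so a
    500 such as the one described in this issue surfaces as an exception.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `check_fixity(build_fixity_request("http://127.0.0.1:62081", "896555c9-e27d-43ec-8c2d-265428b74bde", "test", "test"))` would return the report as a dictionary when the check succeeds.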

Remove the pointer file:

$ rm storage_service/8965/55c9/e27d/43ec/8c2d/2654/28b7/4bde/pointer.896555c9-e27d-43ec-8c2d-265428b74bde.xml

Rerun the fixity check:

$ http -v --pretty=format \
>     GET "http://127.0.0.1:62081/api/v2/file/896555c9-e27d-43ec-8c2d-265428b74bde/check_fixity/" \
>     Authorization:"ApiKey test:test"

GET /api/v2/file/896555c9-e27d-43ec-8c2d-265428b74bde/check_fixity/ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Authorization: ApiKey test:test
Connection: keep-alive
Host: 127.0.0.1:62081
User-Agent: HTTPie/1.0.3

HTTP/1.1 500 INTERNAL SERVER ERROR
Connection: keep-alive
Content-Language: en
Content-Type: application/json
Date: Wed, 25 Sep 2019 16:07:56 GMT
Server: nginx/1.16.0
Transfer-Encoding: chunked
Vary: Accept-Language, Cookie
X-Frame-Options: SAMEORIGIN

{
    "error_message": "Error reading file '/var/archivematica/storage_service/8965/55c9/e27d/43ec/8c2d/2654/28b7/4bde/pointer.896555c9-e27d-43ec-8c2d-265428b74bde.xml': failed to load external entity \"/var/archivematica/storage_service/8965/55c9/e27d/43ec/8c2d/2654/28b7/4bde/pointer.896555c9-e27d-43ec-8c2d-265428b74bde.xml\"",
    "traceback": "Traceback (most recent call last):\n\n  File \"/usr/local/lib/python2.7/site-packages/tastypie/resources.py\", line 220, in wrapper\n    response = callback(request, *args, **kwargs)\n\n  File \"/src/storage_service/locations/api/resources.py\", line 135, in wrapper\n    result = func(resource, request, bundle, **kwargs)\n\n  File \"/src/storage_service/locations/api/resources.py\", line 1366, in check_fixity_request\n    force_local=force_local\n\n  File \"/src/storage_service/locations/models/package.py\", line 1891, in get_fixity_check_report_send_signals\n    force_local=force_local\n\n  File \"/src/storage_service/locations/models/package.py\", line 1847, in check_fixity\n    path, temp_dir = self.extract_file()\n\n  File \"/src/storage_service/locations/models/package.py\", line 1511, in extract_file\n    compression = utils.get_compression(self.full_pointer_file_path)\n\n  File \"/src/storage_service/common/utils.py\", line 293, in get_compression\n    doc = etree.parse(pointer_path)\n\n  File \"src/lxml/lxml.etree.pyx\", line 3427, in lxml.etree.parse (src/lxml/lxml.etree.c:81117)\n\n  File \"src/lxml/parser.pxi\", line 1811, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:117848)\n\n  File \"src/lxml/parser.pxi\", line 1837, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:118195)\n\n  File \"src/lxml/parser.pxi\", line 1741, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:117107)\n\n  File \"src/lxml/parser.pxi\", line 1138, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:111653)\n\n  File \"src/lxml/parser.pxi\", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:105109)\n\n  File \"src/lxml/parser.pxi\", line 706, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:106817)\n\n  File \"src/lxml/parser.pxi\", line 633, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:105628)\n\nIOError: Error reading file '/var/archivematica/storage_service/8965/55c9/e27d/43ec/8c2d/2654/28b7/4bde/pointer.896555c9-e27d-43ec-8c2d-265428b74bde.xml': failed to load external entity \"/var/archivematica/storage_service/8965/55c9/e27d/43ec/8c2d/2654/28b7/4bde/pointer.896555c9-e27d-43ec-8c2d-265428b74bde.xml\"\n"
}
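
The traceback shows that `extract_file` calls `utils.get_compression` with the pointer file path, and that call parses the pointer file unconditionally. One possible direction is to fall back to inspecting the package itself when the pointer file is absent. The following is only a sketch of that idea, not the Storage Service's implementation: `sniff_compression`, `get_compression_with_fallback`, the `parse_pointer` argument and the returned labels are all hypothetical names chosen for illustration.

```python
import os

# Leading "magic" bytes of some common archive formats. The labels are
# illustrative only; they are not the constants the Storage Service uses.
MAGIC_SIGNATURES = [
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
]


def sniff_compression(package_path):
    """Guess the compression of a package from its first bytes.

    Returns a label from MAGIC_SIGNATURES, or None if nothing matches.
    """
    with open(package_path, "rb") as handle:
        header = handle.read(8)
    for signature, label in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return label
    return None


def get_compression_with_fallback(pointer_path, package_path, parse_pointer):
    """Use the pointer file when it exists, otherwise sniff the package.

    parse_pointer is a stand-in (assumption) for the existing
    pointer-file parser; it is only called when the file is present.
    """
    if os.path.isfile(pointer_path):
        return parse_pointer(pointer_path)
    return sniff_compression(package_path)
```

A fallback like this would only recover the compression format. Anything else the pointer file records would still be unavailable, so a real fix may need more than this.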

Steps to reproduce

As above.

Your environment (version of Archivematica, OS version, etc)

Docker, running Archivematica 1.10.x.

Additional context

Spotted by @mamedin on a client's server.

Workaround

One workaround that we have explored internally with Ops is to use the Storage Service's import command to create a new pointer file for the existing AIP. Depending on your use case, this might be an appropriate way forward.


ross-spencer commented 3 years ago

Related to https://github.com/archivematica/Issues/issues/616