Closed andrewjbtw closed 4 years ago
Thanks for reporting, @andrewjbtw!
I am seeing this in 1.9 but not in 1.9.1 or the latest branch (proto-1.10), so I believe it is resolved. I think we should keep this issue open and make sure to QA it for our next 1.10 release. (Thanks Darren for adding the triage label!)
I have moved this back into ready to indicate that it looks like it still needs to be addressed.
I can recreate it with the following configuration locally with the latest commit: https://github.com/artefactual/archivematica/commit/42e137fdb716c19fa7e6d5f233bd13a3b0a1d5d6
<processingMCP>
<preconfiguredChoices>
<!-- Store DIP -->
<preconfiguredChoice>
<appliesTo>5e58066d-e113-4383-b20b-f301ed4d751c</appliesTo>
<goToChain>4500f34e-f004-4ccf-8720-5c38d0be2254</goToChain>
</preconfiguredChoice>
<!-- Select compression level -->
<preconfiguredChoice>
<appliesTo>01c651cb-c174-4ba4-b985-1d87a44d6754</appliesTo>
<goToChain>ecfad581-b007-4612-a0e0-fcc551f4057f</goToChain>
</preconfiguredChoice>
<!-- Examine contents -->
<preconfiguredChoice>
<appliesTo>accea2bf-ba74-4a3a-bb97-614775c74459</appliesTo>
<goToChain>e0a39199-c62a-4a2f-98de-e9d1116460a8</goToChain>
</preconfiguredChoice>
<!-- Perform file format identification (Submission documentation & metadata) -->
<preconfiguredChoice>
<appliesTo>087d27be-c719-47d8-9bbb-9a7d8b609c44</appliesTo>
<goToChain>4dec164b-79b0-4459-8505-8095af9655b5</goToChain>
</preconfiguredChoice>
<!-- Normalize (match 1 for "Do not normalize") -->
<preconfiguredChoice>
<appliesTo>cb8e5706-e73f-472f-ad9b-d1236af8095f</appliesTo>
<goToChain>89cb80dd-0636-464f-930d-57b61e3928b2</goToChain>
</preconfiguredChoice>
<!-- Normalize (match 2 for "Do not normalize") -->
<preconfiguredChoice>
<appliesTo>7509e7dc-1e1b-4dce-8d21-e130515fce73</appliesTo>
<goToChain>e8544c5e-9cbb-4b8f-a68b-6d9b4d7f7362</goToChain>
</preconfiguredChoice>
<!-- Bind PIDs -->
<preconfiguredChoice>
<appliesTo>05357876-a095-4c11-86b5-a7fff01af668</appliesTo>
<goToChain>fcfea449-158c-452c-a8ad-4ae009c4eaba</goToChain>
</preconfiguredChoice>
<!-- Create SIP(s) -->
<preconfiguredChoice>
<appliesTo>bb194013-597c-4e4a-8493-b36d190f8717</appliesTo>
<goToChain>61cfa825-120e-4b17-83e6-51a42b67d969</goToChain>
</preconfiguredChoice>
<!-- Delete packages after extraction -->
<preconfiguredChoice>
<appliesTo>f19926dd-8fb5-4c79-8ade-c83f61f55b40</appliesTo>
<goToChain>85b1e45d-8f98-4cae-8336-72f40e12cbef</goToChain>
</preconfiguredChoice>
<!-- Transcribe files (OCR) -->
<preconfiguredChoice>
<appliesTo>7079be6d-3a25-41e6-a481-cee5f352fe6e</appliesTo>
<goToChain>1170e555-cd4e-4b2f-a3d6-bfb09e8fcc53</goToChain>
</preconfiguredChoice>
<!-- Perform file format identification (Transfer) -->
<preconfiguredChoice>
<appliesTo>f09847c2-ee51-429a-9478-a860477f6b8d</appliesTo>
<goToChain>d97297c7-2b49-4cfe-8c9f-0613d63ed763</goToChain>
</preconfiguredChoice>
<!-- Store DIP location -->
<preconfiguredChoice>
<appliesTo>cd844b6e-ab3c-4bc6-b34f-7103f88715de</appliesTo>
<goToChain>/api/v2/location/default/DS/</goToChain>
</preconfiguredChoice>
<!-- Generate transfer structure report -->
<preconfiguredChoice>
<appliesTo>56eebd45-5600-4768-a8c2-ec0114555a3d</appliesTo>
<goToChain>df54fec1-dae1-4ea6-8d17-a839ee7ac4a7</goToChain>
</preconfiguredChoice>
<!-- Perform policy checks on originals -->
<preconfiguredChoice>
<appliesTo>70fc7040-d4fb-4d19-a0e6-792387ca1006</appliesTo>
<goToChain>3e891cc4-39d2-4989-a001-5107a009a223</goToChain>
</preconfiguredChoice>
<!-- Reminder: add metadata if desired -->
<preconfiguredChoice>
<appliesTo>eeb23509-57e2-4529-8857-9d62525db048</appliesTo>
<goToChain>5727faac-88af-40e8-8c10-268644b0142d</goToChain>
</preconfiguredChoice>
<!-- Generate thumbnails -->
<preconfiguredChoice>
<appliesTo>498f7a6d-1b8c-431a-aa5d-83f14f3c5e65</appliesTo>
<goToChain>972fce6c-52c8-4c00-99b9-d6814e377974</goToChain>
</preconfiguredChoice>
<!-- Store AIP -->
<preconfiguredChoice>
<appliesTo>2d32235c-02d4-4686-88a6-96f4d6c7b1c3</appliesTo>
<goToChain>9efab23c-31dc-4cbd-a39d-bb1665460cbe</goToChain>
</preconfiguredChoice>
<!-- Perform policy checks on access derivatives -->
<preconfiguredChoice>
<appliesTo>8ce07e94-6130-4987-96f0-2399ad45c5c2</appliesTo>
<goToChain>76befd52-14c3-44f9-838f-15a4e01624b0</goToChain>
</preconfiguredChoice>
<!-- Perform file format identification (Ingest) -->
<preconfiguredChoice>
<appliesTo>7a024896-c4f7-4808-a240-44c87c762bc5</appliesTo>
<goToChain>3c1faec7-7e1e-4cdd-b3bd-e2f05f4baa9b</goToChain>
</preconfiguredChoice>
<!-- Perform policy checks on preservation derivatives -->
<preconfiguredChoice>
<appliesTo>153c5f41-3cfb-47ba-9150-2dd44ebc27df</appliesTo>
<goToChain>b7ce05f0-9d94-4b3e-86cc-d4b2c6dba546</goToChain>
</preconfiguredChoice>
<!-- Assign UUIDs to directories -->
<preconfiguredChoice>
<appliesTo>bd899573-694e-4d33-8c9b-df0af802437d</appliesTo>
<goToChain>2dc3f487-e4b0-4e07-a4b3-6216ed24ca14</goToChain>
</preconfiguredChoice>
<!-- Document empty directories -->
<preconfiguredChoice>
<appliesTo>d0dfa5fc-e3c2-4638-9eda-f96eea1070e0</appliesTo>
<goToChain>65273f18-5b4e-4944-af4f-09be175a88e8</goToChain>
</preconfiguredChoice>
<!-- Send transfer to quarantine -->
<preconfiguredChoice>
<appliesTo>755b4177-c587-41a7-8c52-015277568302</appliesTo>
<goToChain>d4404ab1-dc7f-4e9e-b1f8-aa861e766b8e</goToChain>
</preconfiguredChoice>
<!-- Extract packages -->
<preconfiguredChoice>
<appliesTo>dec97e3c-5598-4b99-b26e-f87a435a6b7f</appliesTo>
<goToChain>79f1f5af-7694-48a4-b645-e42790bbf870</goToChain>
</preconfiguredChoice>
<!-- Upload DIP -->
<preconfiguredChoice>
<appliesTo>92879a29-45bf-4f0b-ac43-e64474f0f2f9</appliesTo>
<goToChain>6eb8ebe7-fab3-4e4c-b9d7-14de17625baa</goToChain>
</preconfiguredChoice>
</preconfiguredChoices>
</processingMCP>
But I can also see on https://sandbox.archivematica.org that if I use a similar configuration and switch the compression setting between a compressed AIP and an uncompressed AIP (both with thumbnails turned off), I can recreate the issue in the latter case (uncompressed AIPs), as per my local config above.
Compressed AIP generation
Uncompressed AIP generation
There may be some other subtleties in the processing config settings, but it's worth investigating further, as it does seem this issue still stands.
I wonder what is left behind if the microservice does fail? Presumably there is a chance of having quite an excess of detritus left hanging around if the logs are correct in suggesting the objects and logs directories are not removed? For reference, the workflow provides three args:
"63f35161-ba76-4a43-8cfa-c38c6a2d5b2f": {
"config": {
"@manager": "linkTaskManagerDirectories",
"@model": "StandardTaskConfig",
"arguments": "-R \"%SIPLogsDirectory%\" \"%SIPObjectsDirectory%\" \"%SIPDirectory%thumbnails/\"",
"execute": "remove_v0.0",
"filter_file_end": null,
"filter_file_start": null,
"filter_subdir": null,
"stderr_file": null,
"stdout_file": null
},
"description": {
"en": "Remove bagged files",
"pt_BR": "Remover pacotes de arquivos",
"sv": "Ta bort filer som blivit satta i en bag"
},
"exit_codes": {
"0": {
"job_status": "Completed successfully",
"link_id": "7c44c454-e3cc-43d4-abe0-885f93d693c6"
}
},
"fallback_job_status": "Failed",
"fallback_link_id": "7c44c454-e3cc-43d4-abe0-885f93d693c6",
"group": {
"en": "Prepare AIP",
"es": "Preparar AIP",
"fr": "Préparer l'AIP",
"sv": "Förbered AIP"
}
},
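One way to read the link definition above is sketched below. It is only an interpretation of the quoted JSON, not the MCPServer's actual code, and `resolve_outcome` is a hypothetical helper. Since `rm` exits non-zero when an operand is missing, the exit code would not match `"0"` and the fallback would apply.

```python
# Trimmed copy of the workflow link quoted above; only the fields used here.
link = {
    "exit_codes": {
        "0": {
            "job_status": "Completed successfully",
            "link_id": "7c44c454-e3cc-43d4-abe0-885f93d693c6",
        }
    },
    "fallback_job_status": "Failed",
    "fallback_link_id": "7c44c454-e3cc-43d4-abe0-885f93d693c6",
}


def resolve_outcome(link, exit_code):
    """Hypothetical helper: map an exit code to (job_status, next_link_id)."""
    match = link["exit_codes"].get(str(exit_code))
    if match is not None:
        return match["job_status"], match["link_id"]
    # Any exit code not listed falls through to the fallback entries.
    return link["fallback_job_status"], link["fallback_link_id"]


print(resolve_outcome(link, 0))
print(resolve_outcome(link, 1))
```

Under this reading, both outcomes point at the same next link, so the job would be reported as failed while processing carries on. That would be consistent with the AIPs looking as expected, but it is an inference from the JSON rather than something confirmed in this thread.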
Thanks for following up. I've been remiss at checking back in on this issue, but I am also still seeing this with thumbnails and uncompressed AIPs in 1.9.1. The AIPs themselves seem to be exactly what I expect them to be, though I suppose since I've never had Archivematica create thumbnails, maybe I've never seen any other AIP structure. At a quick glance, the AIPs I see when downloading from the demo site look very much like the AIPs I get from our local production and testing instances.
In 1.3 and 1.4 the thumbnail generation seemed to be tied to service files, so SIPs without service files never got thumbnails, and I never saw errors related to the thumbnails. So this is not something I've investigated deeply before.
@sevein do you think this will still be an issue in qa/1.x, after the no-ops changes?
This issue is not appearing in 1.10.x. I think we can close it.
Hi @evelynPM, you need a specific combination of options. Without using the processing configuration I have saved above, I believe you simply need:
I have left a transfer on the 1.10.1 test server today. This uses the DemoTransferCSV set, and you can inspect the configuration and the failing microservice to observe the behavior. I can recreate this on CentOS and in our Docker deploy.
NB. For other readers: the links to the services above will likely disappear in time.
I had a look on qa/1.x and it seems this is still present, so when we were chatting about this yesterday, I was misremembering the impact of the no-op work.
A basic fix that might be enough to satisfy this issue should just see the microservice checking the existence of a path before trying to delete it: rm: cannot remove ‘/var/archivematica/sharedDirectory/watchedDirectories/workFlowDecisions/compressionAIPDecisions/thumbnailTest-5cc5375f-0be1-4574-8419-7486a1ff2bb4/thumbnails/’: No such file or directory.
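A minimal sketch of that existence check, assuming a Python client script; `remove_existing` is a hypothetical name and this is not the code of any existing Archivematica script. Missing paths produce a warning on stderr instead of an error.

```python
import shutil
import sys
from pathlib import Path


def remove_existing(paths):
    """Remove each path that exists; warn about the ones that do not.

    Returns the list of paths that were skipped because they were missing.
    """
    missing = []
    for raw in paths:
        path = Path(raw)
        if not path.exists():
            print(f"warning: {path} does not exist, skipping", file=sys.stderr)
            missing.append(str(path))
            continue
        if path.is_dir() and not path.is_symlink():
            # Real directory: remove the whole tree.
            shutil.rmtree(path)
        else:
            # Regular file or symlink: remove just the entry.
            path.unlink()
    return missing
```

A caller could then treat a non-empty return value as informational and still exit with status 0, so a missing thumbnails directory would no longer mark the job as failed.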
You're right, @ross-spencer. I was able to re-create the issue on 1.10.x using no normalization / no thumbnails / uncompressed.
We also ran into this issue on Archivematica 1.9.2 with different settings: create single SIP and continue processing; no thumbnails; normalize for preservation and access.
We used the following configuration:
<processingMCP>
<preconfiguredChoices>
<!-- Store DIP -->
<preconfiguredChoice>
<appliesTo>5e58066d-e113-4383-b20b-f301ed4d751c</appliesTo>
<goToChain>8d29eb3d-a8a8-4347-806e-3d8227ed44a1</goToChain>
</preconfiguredChoice>
<!-- Select compression level -->
<preconfiguredChoice>
<appliesTo>01c651cb-c174-4ba4-b985-1d87a44d6754</appliesTo>
<goToChain>414da421-b83f-4648-895f-a34840e3c3f5</goToChain>
</preconfiguredChoice>
<!-- Examine contents -->
<preconfiguredChoice>
<appliesTo>accea2bf-ba74-4a3a-bb97-614775c74459</appliesTo>
<goToChain>e0a39199-c62a-4a2f-98de-e9d1116460a8</goToChain>
</preconfiguredChoice>
<!-- Remove from quarantine after (days) -->
<preconfiguredChoice>
<appliesTo>19adb668-b19a-4fcb-8938-f49d7485eaf3</appliesTo>
<goToChain>333643b7-122a-4019-8bef-996443f3ecc5</goToChain>
<delay unitCtime="yes">2419200.0</delay>
</preconfiguredChoice>
<!-- Normalize (match 1 for "Normalize for preservation and access") -->
<preconfiguredChoice>
<appliesTo>cb8e5706-e73f-472f-ad9b-d1236af8095f</appliesTo>
<goToChain>b93cecd4-71f2-4e28-bc39-d32fd62c5a94</goToChain>
</preconfiguredChoice>
<!-- Bind PIDs -->
<preconfiguredChoice>
<appliesTo>05357876-a095-4c11-86b5-a7fff01af668</appliesTo>
<goToChain>fcfea449-158c-452c-a8ad-4ae009c4eaba</goToChain>
</preconfiguredChoice>
<!-- Create SIP(s) -->
<preconfiguredChoice>
<appliesTo>bb194013-597c-4e4a-8493-b36d190f8717</appliesTo>
<goToChain>61cfa825-120e-4b17-83e6-51a42b67d969</goToChain>
</preconfiguredChoice>
<!-- Delete packages after extraction -->
<preconfiguredChoice>
<appliesTo>f19926dd-8fb5-4c79-8ade-c83f61f55b40</appliesTo>
<goToChain>85b1e45d-8f98-4cae-8336-72f40e12cbef</goToChain>
</preconfiguredChoice>
<!-- Transcribe files (OCR) -->
<preconfiguredChoice>
<appliesTo>7079be6d-3a25-41e6-a481-cee5f352fe6e</appliesTo>
<goToChain>1170e555-cd4e-4b2f-a3d6-bfb09e8fcc53</goToChain>
</preconfiguredChoice>
<!-- Store DIP location -->
<preconfiguredChoice>
<appliesTo>cd844b6e-ab3c-4bc6-b34f-7103f88715de</appliesTo>
<goToChain>/api/v2/location/df9f97df-0fd2-47cb-9862-ad6ec78058b9/</goToChain>
</preconfiguredChoice>
<!-- Generate transfer structure report -->
<preconfiguredChoice>
<appliesTo>56eebd45-5600-4768-a8c2-ec0114555a3d</appliesTo>
<goToChain>e9eaef1e-c2e0-4e3b-b942-bfb537162795</goToChain>
</preconfiguredChoice>
<!-- Perform policy checks on originals -->
<preconfiguredChoice>
<appliesTo>70fc7040-d4fb-4d19-a0e6-792387ca1006</appliesTo>
<goToChain>3e891cc4-39d2-4989-a001-5107a009a223</goToChain>
</preconfiguredChoice>
<!-- Reminder: add metadata if desired -->
<preconfiguredChoice>
<appliesTo>eeb23509-57e2-4529-8857-9d62525db048</appliesTo>
<goToChain>5727faac-88af-40e8-8c10-268644b0142d</goToChain>
</preconfiguredChoice>
<!-- Store AIP -->
<preconfiguredChoice>
<appliesTo>2d32235c-02d4-4686-88a6-96f4d6c7b1c3</appliesTo>
<goToChain>9efab23c-31dc-4cbd-a39d-bb1665460cbe</goToChain>
</preconfiguredChoice>
<!-- Perform policy checks on access derivatives -->
<preconfiguredChoice>
<appliesTo>8ce07e94-6130-4987-96f0-2399ad45c5c2</appliesTo>
<goToChain>76befd52-14c3-44f9-838f-15a4e01624b0</goToChain>
</preconfiguredChoice>
<!-- Perform file format identification (Ingest) -->
<preconfiguredChoice>
<appliesTo>7a024896-c4f7-4808-a240-44c87c762bc5</appliesTo>
<goToChain>3c1faec7-7e1e-4cdd-b3bd-e2f05f4baa9b</goToChain>
</preconfiguredChoice>
<!-- Perform policy checks on preservation derivatives -->
<preconfiguredChoice>
<appliesTo>153c5f41-3cfb-47ba-9150-2dd44ebc27df</appliesTo>
<goToChain>b7ce05f0-9d94-4b3e-86cc-d4b2c6dba546</goToChain>
</preconfiguredChoice>
<!-- Assign UUIDs to directories -->
<preconfiguredChoice>
<appliesTo>bd899573-694e-4d33-8c9b-df0af802437d</appliesTo>
<goToChain>2dc3f487-e4b0-4e07-a4b3-6216ed24ca14</goToChain>
</preconfiguredChoice>
<!-- Document empty directories -->
<preconfiguredChoice>
<appliesTo>d0dfa5fc-e3c2-4638-9eda-f96eea1070e0</appliesTo>
<goToChain>29881c21-3548-454a-9637-ebc5fd46aee0</goToChain>
</preconfiguredChoice>
<!-- Send transfer to quarantine -->
<preconfiguredChoice>
<appliesTo>755b4177-c587-41a7-8c52-015277568302</appliesTo>
<goToChain>d4404ab1-dc7f-4e9e-b1f8-aa861e766b8e</goToChain>
</preconfiguredChoice>
<!-- Extract packages -->
<preconfiguredChoice>
<appliesTo>dec97e3c-5598-4b99-b26e-f87a435a6b7f</appliesTo>
<goToChain>01d80b27-4ad1-4bd1-8f8d-f819f18bf685</goToChain>
</preconfiguredChoice>
<!-- Approve normalization -->
<preconfiguredChoice>
<appliesTo>de909a42-c5b5-46e1-9985-c031b50e9d30</appliesTo>
<goToChain>1e0df175-d56d-450d-8bee-7df1dc7ae815</goToChain>
</preconfiguredChoice>
<!-- Upload DIP -->
<preconfiguredChoice>
<appliesTo>92879a29-45bf-4f0b-ac43-e64474f0f2f9</appliesTo>
<goToChain>6eb8ebe7-fab3-4e4c-b9d7-14de17625baa</goToChain>
</preconfiguredChoice>
</preconfiguredChoices>
</processingMCP>
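To see where two processing configurations like the ones posted in this thread differ, a small comparison helper can be useful. The sketch below uses only the standard library; `load_choices` and `diff_choices` are hypothetical names, and the two inline documents are abbreviated excerpts (compression level and thumbnails entries only) rather than the full configurations.

```python
import xml.etree.ElementTree as ET


def load_choices(xml_text):
    """Return {appliesTo: goToChain} for every preconfiguredChoice."""
    root = ET.fromstring(xml_text)
    return {
        choice.findtext("appliesTo"): choice.findtext("goToChain")
        for choice in root.iter("preconfiguredChoice")
    }


def diff_choices(a, b):
    """Return {appliesTo: (value_in_a, value_in_b)} where the configs differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Abbreviated excerpt of the first configuration in this thread.
first = """<processingMCP><preconfiguredChoices>
  <preconfiguredChoice>
    <appliesTo>01c651cb-c174-4ba4-b985-1d87a44d6754</appliesTo>
    <goToChain>ecfad581-b007-4612-a0e0-fcc551f4057f</goToChain>
  </preconfiguredChoice>
  <preconfiguredChoice>
    <appliesTo>498f7a6d-1b8c-431a-aa5d-83f14f3c5e65</appliesTo>
    <goToChain>972fce6c-52c8-4c00-99b9-d6814e377974</goToChain>
  </preconfiguredChoice>
</preconfiguredChoices></processingMCP>"""

# Abbreviated excerpt of the second configuration (no thumbnails entry).
second = """<processingMCP><preconfiguredChoices>
  <preconfiguredChoice>
    <appliesTo>01c651cb-c174-4ba4-b985-1d87a44d6754</appliesTo>
    <goToChain>414da421-b83f-4648-895f-a34840e3c3f5</goToChain>
  </preconfiguredChoice>
</preconfiguredChoices></processingMCP>"""

for key, (left, right) in diff_choices(load_choices(first), load_choices(second)).items():
    print(key, left, right)
```

Entries present in only one file show up with `None` on the other side.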
There are currently two jobs for removing temporary bag directories when an AIP is being prepared to be stored: one runs the removeDirectories_v0.0 client script and one runs the remove_v0.0 client script. Notice the job titles are slightly different.
The problem with remove_v0.0 is that it is based on the rm command, which fails when a target argument doesn't exist (like when thumbnails are not generated but the thumbnails directory is passed to the job). So, this draft PR replaces it in the first case with the Python-based removeDirectories, which will just print a warning instead of failing the job. The job titles (descriptions in the workflow) have also been synchronized.
But I'm wondering if we should merge both jobs into a single one. I think they're intended to be doing the same task (remove temporary bag directories) and their arguments are very similar but not identical:
Arguments passed to remove_v0.0:
\"%SIPLogsDirectory%\" \"%SIPObjectsDirectory%\" \"%SIPDirectory%thumbnails/\"
Arguments passed to removeDirectories_v0.0:
"\"%SIPDirectory%%SIPName%-%SIPUUID%\" \"%SIPLogsDirectory%\" \"%SIPObjectsDirectory%\" \"%SIPDirectory%thumbnails/\""
@sromkey @sallain @ross-spencer @evelynPM any thoughts?
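The difference between the two argument lists can be made explicit with a short sketch. The strings below are the arguments quoted above with the JSON escaping removed; splitting them with `shlex` is my assumption about how they would be tokenized, not a statement about how the MCPClient parses them.

```python
import shlex

# Arguments quoted in this thread, with the JSON backslash escaping removed.
remove_args = '"%SIPLogsDirectory%" "%SIPObjectsDirectory%" "%SIPDirectory%thumbnails/"'
remove_directories_args = (
    '"%SIPDirectory%%SIPName%-%SIPUUID%" "%SIPLogsDirectory%" '
    '"%SIPObjectsDirectory%" "%SIPDirectory%thumbnails/"'
)

shared = shlex.split(remove_args)
only_in_remove_directories = [
    arg for arg in shlex.split(remove_directories_args) if arg not in shared
]
print(only_in_remove_directories)
# -> ['%SIPDirectory%%SIPName%-%SIPUUID%']
```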
I have no concerns about combining the jobs into a single one if they're accomplishing the same thing - I'll just loop in @sevein in case there's an architectural reason not to.
To be honest I had no idea, but what you're suggesting Douglas makes total sense!
My idea didn't get far :sweat_smile: There's a reason for the arguments to be different: if I try to remove the \"%SIPDirectory%%SIPName%-%SIPUUID%\" in an uncompressed AIP I am removing the AIP directory itself!
Sorry for the noise :blush:
@replaceafill :rofl:
Tested in qa/1.x (last commit: https://github.com/artefactual/archivematica/commit/6d84db86d7d8fa21cea6f79f4b2ef1ec4c9666ef). Glad to see this one fixed because it's come up a few times!
Expected behaviour: Choosing not to generate thumbnails shouldn't cause "Remove bagged files" to report a failure.
Current behaviour: If you choose not to create thumbnails, the "Remove bagged files" micro-service reports a failure. The stderr for this micro-service indicates the failure is because the "thumbnails" directory does not exist and so can't be removed.
Steps to reproduce: Generate any ingest without thumbnails and it will show a failure at "Remove bagged files".
Your environment (version of Archivematica, OS version, etc): Ubuntu 18.04, AM 1.8, AM 1.9