artefactual-sdps / enduro

A tool to support ingest and automation in digital preservation workflows
https://enduro.readthedocs.io/
Apache License 2.0

Problem: it is hard to isolate failed packages for review #929

Closed sallain closed 1 week ago

sallain commented 11 months ago

Is your feature request related to a problem? Please describe.

When transfers or SIPs fail, the package remains in its current location. If there are many packages, it takes time to locate the failure. Moving the package to a separate location makes it easier for users to see which packages have failed so that they can be fixed.

Describe the solution you'd like

When a transfer fails in pre-processing, move it to a "failed transfer" location (name tbd).

When a SIP fails during ingest, in either Archivematica or a3m, move it to a "failed SIP" location (name tbd).

The workflow should look something like this, with details to be discussed:

image

Some sort of user notification should also be present in Enduro (maybe this is a second phase of work) - even just adding the path to the Location column on the Packages tab in Enduro would be helpful.

Describe alternatives you've considered

None


DanielCosme commented 11 months ago

This is now working in this MR: https://github.com/artefactual-sdps/enduro-sfa/pull/2
More specifically, in these two commits: https://github.com/artefactual-sdps/enduro-sfa/pull/2/commits/ff8c353f63800f2d04b4c5a92b72341ed6a11a27 https://github.com/artefactual-sdps/enduro-sfa/pull/2/commits/3823cf020b69d4fdc641a5d0c6fab193285348b1

When the transfer fails at any point in the pre-processing step, the original transfer is copied to a failed-transfers bucket. If the SIP is created correctly but then fails at any point in the ingest process, it is sent to a failed-sips bucket.

image
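
The routing described in this comment could be sketched roughly as follows. This is only an illustration, not the code from the linked commits: the `Stage` type, the `FailedDestination` function, and the constant names are all hypothetical, and only the two bucket names come from the comment itself.

```go
package main

import (
	"fmt"
)

// Stage identifies where in the workflow a package failed.
// Hypothetical type, not taken from the Enduro codebase.
type Stage int

const (
	StagePreprocessing Stage = iota // failure before a SIP exists
	StageIngest                     // failure after the SIP was created
)

// Bucket names as mentioned in the comment; the constants are illustrative.
const (
	FailedTransfersBucket = "failed-transfers"
	FailedSIPsBucket      = "failed-sips"
)

// FailedDestination returns the bucket a failed package should go to
// and a short label for what is stored there.
func FailedDestination(stage Stage) (bucket string, payload string, err error) {
	switch stage {
	case StagePreprocessing:
		return FailedTransfersBucket, "original transfer", nil
	case StageIngest:
		return FailedSIPsBucket, "SIP", nil
	default:
		return "", "", fmt.Errorf("unknown stage: %d", stage)
	}
}

func main() {
	for _, s := range []Stage{StagePreprocessing, StageIngest} {
		bucket, payload, err := FailedDestination(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", payload, bucket)
	}
}
```

Keeping the decision in one small function means the copy or move step only needs a bucket name, regardless of which stage reported the failure.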

sallain commented 11 months ago

@DanielCosme you can use GitHub keywords to link issues and PRs - note that using fixes or closes will close the issue when the PR is merged, which is kinda nice.

jraddaoui commented 11 months ago

@sallain A few notes/questions about this issue:

sallain commented 11 months ago

@jraddaoui responses!

  1. Yeah, moving it probably makes sense.
  2. The failed transfers bucket gets the original package. The failed SIPs bucket gets the SIP as it was transformed by pre-processing (sort of an interim-state package, but I think that's the best option since it contains all the original material plus anything that was [or will be] created in pre-processing)
  3. Yes, the a3m/Archivematica thing still has some holes. I can create an issue linked to this one that describes this.

jraddaoui commented 11 months ago

Thanks @sallain:

  1. I tried to transfer the issue, but it doesn't seem to find the SFA fork repository. Let me know if you have other ideas for moving it; otherwise I'll copy the description and Daniel's update and create a new issue there.
  2. Sounds good!
  3. If it's only for the SFA fork we should be good. But I think we should consider adding the failed SIP part to main Enduro, and look at ways to avoid mixing pre-processing and preservation tasks, especially if we move to the child workflow idea in the future, as this will be outside the pre-processing context/domain.

sallain commented 8 months ago

@DanielCosme is this issue closed?

sallain commented 6 months ago

Note: this work is on a branch that has been deployed for testing. Issue will remain open until merged into main.

jraddaoui commented 2 months ago

I have updated the WIP branch with:

jraddaoui commented 2 weeks ago

The initial implementation has been merged into main. Failed buckets are set up through configuration; we may revisit this in the future to see whether we move them to a new type (or types) of location in the storage service layer.
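
For readers unfamiliar with the setup, a configuration-driven pair of failed buckets might be modelled roughly like this. This is a sketch under assumptions: the `BucketConfig` and `FailedConfig` types, their field names, and the example URLs are hypothetical and do not reflect Enduro's actual configuration keys.

```go
package main

import (
	"errors"
	"fmt"
)

// BucketConfig describes one failed-package bucket.
// Hypothetical shape; not Enduro's real configuration schema.
type BucketConfig struct {
	URL string // bucket location, e.g. "s3://failed-sips" (illustrative value)
}

// FailedConfig groups the two failed-package buckets discussed in this issue.
type FailedConfig struct {
	Transfers BucketConfig // receives transfers that fail in pre-processing
	SIPs      BucketConfig // receives SIPs that fail during ingest
}

// Validate reports the first missing bucket URL, if any.
func (c FailedConfig) Validate() error {
	if c.Transfers.URL == "" {
		return errors.New("failed transfers bucket URL is required")
	}
	if c.SIPs.URL == "" {
		return errors.New("failed SIPs bucket URL is required")
	}
	return nil
}

func main() {
	cfg := FailedConfig{
		Transfers: BucketConfig{URL: "s3://failed-transfers"},
		SIPs:      BucketConfig{URL: "s3://failed-sips"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Println("failed transfers bucket:", cfg.Transfers.URL)
	fmt.Println("failed SIPs bucket:", cfg.SIPs.URL)
}
```

Validating both buckets at startup surfaces a missing setting before any package fails, rather than at the moment a failed package needs somewhere to go.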

sallain commented 1 week ago

Looks like this is working as expected!