artefactual-sdps / preprocessing-sfa

preprocessing-sfa is an Enduro preprocessing workflow for SFA SIPs

Problem: "Verify SIP checksums" activity assumes checksums will be MD5 #88

Open sallain opened 3 days ago

sallain commented 3 days ago

Describe the bug

The checksum validation activity assumes that checksums in the provided metadata will be MD5. However, the eCH-0160 standard states that MD5, SHA-256, SHA-512, and SHA-1 are all permissible. Some of the sample packages that we have use SHA-256, and therefore fail on checksum validation.

To Reproduce

Steps to reproduce the behavior:

  1. Upload a sample package whose metadata uses non-MD5 checksums (e.g. GEVER)
  2. The "Verify SIP checksums" activity fails

Expected behavior

The activity should support all valid checksum algorithms. In the metadata file, each checksum is accompanied by a pruefalgorithmus element that names its algorithm. An MD5 example:

<pruefalgorithmus>MD5</pruefalgorithmus>
<pruefsumme>aa1f7ad13064e1643ac85478296165af</pruefsumme>

A SHA-256 example:

<pruefalgorithmus>SHA-256</pruefalgorithmus>
<pruefsumme>F4F4789FD15E5B9B19E0F2C2D640AC2BEA0690424FFD9258E6B35028A1981D0B</pruefsumme>

The activity should read the pruefalgorithmus element to determine the checksum type, then validate the pruefsumme value using the matching algorithm.
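
A minimal sketch of what algorithm-aware validation could look like in Go, using only the standard library. This is not the activity's current code: the function names (newHash, verifyChecksum, verifyBytes) are hypothetical, and the accepted algorithm spellings are assumed from the examples in this issue. The comparison is case-insensitive because the samples above show both lowercase (MD5) and uppercase (SHA-256) hex digests.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"strings"
)

// newHash maps a pruefalgorithmus value to a hash implementation.
// Assumption: the values are spelled "MD5", "SHA-1", "SHA-256" and
// "SHA-512"; case and surrounding whitespace are ignored.
func newHash(algo string) (hash.Hash, error) {
	switch strings.ToUpper(strings.TrimSpace(algo)) {
	case "MD5":
		return md5.New(), nil
	case "SHA-1":
		return sha1.New(), nil
	case "SHA-256":
		return sha256.New(), nil
	case "SHA-512":
		return sha512.New(), nil
	default:
		return nil, fmt.Errorf("unsupported checksum algorithm: %q", algo)
	}
}

// verifyChecksum streams r through the hash named by algo and reports
// whether the hex digest equals want, ignoring letter case.
func verifyChecksum(r io.Reader, algo, want string) (bool, error) {
	h, err := newHash(algo)
	if err != nil {
		return false, err
	}
	if _, err := io.Copy(h, r); err != nil {
		return false, fmt.Errorf("hash content: %w", err)
	}
	got := hex.EncodeToString(h.Sum(nil))
	return strings.EqualFold(got, strings.TrimSpace(want)), nil
}

// verifyBytes is a convenience wrapper for in-memory data.
func verifyBytes(data []byte, algo, want string) (bool, error) {
	return verifyChecksum(bytes.NewReader(data), algo, want)
}
```

In the activity, the reader would be the opened file from the SIP, and algo and want would come from that file's pruefalgorithmus and pruefsumme values. An unknown algorithm returns an error instead of being treated as a mismatch, so an unsupported value can be reported separately from a corrupted file.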
