keeps / commons-ip

Commons IP is a project that provides a command-line tool and a Java library to validate and manipulate E-ARK Information Packages. It can create or process E-ARK SIPs and AIPs and validate them against the official specifications.
http://keeps.github.io/commons-ip/
GNU Lesser General Public License v3.0

I can't process a SIP in RODA when generating the SIP with commons-ip #179

jalshorji closed this issue 1 year ago

jalshorji commented 1 year ago

When using this example, https://github.com/keeps/commons-ip#write-some-code, I get this error in RODA:

`Is the package valid? no INFO METS.xml Main METS.xml file was found. ERROR METS.xml Main METS.xml file is not valid. javax.xml.bind.UnmarshalException`

However, I can see that `OAISPACKAGETYPE` exists in `metsHdr`.
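Seeing the attribute in `metsHdr` does not by itself show which namespace it is bound to. The following is a minimal sketch of how to list each `metsHdr` attribute together with its namespace URI using only the JDK's DOM parser. The class name, the sample fragment, and the URI bound to the `csip` prefix are assumptions for illustration, not content taken from the attached package.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

public class MetsHdrAttributes {
    static final String METS_NS = "http://www.loc.gov/METS/";

    // Hypothetical METS fragment; the csip namespace URI is an assumption.
    static final String SAMPLE =
        "<mets xmlns=\"http://www.loc.gov/METS/\""
        + " xmlns:csip=\"https://dilcis.eu/XML/METS/CSIPExtensionMETS\">"
        + "<metsHdr CREATEDATE=\"2023-01-01T00:00:00Z\" csip:OAISPACKAGETYPE=\"SIP\"/>"
        + "</mets>";

    // Maps each metsHdr attribute's qualified name to its namespace URI
    // (null for attributes that are not in any namespace).
    static Map<String, String> attributeNamespaces(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to resolve prefixes
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element hdr = (Element) doc.getElementsByTagNameNS(METS_NS, "metsHdr").item(0);
            if (hdr == null) {
                throw new IllegalStateException("No metsHdr element found");
            }
            Map<String, String> result = new LinkedHashMap<>();
            NamedNodeMap attrs = hdr.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                result.put(attr.getName(), attr.getNamespaceURI());
            }
            return result;
        } catch (IllegalStateException e) {
            throw e;
        } catch (Exception e) {
            // Wrap parser errors so callers need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        attributeNamespaces(SAMPLE)
            .forEach((name, ns) -> System.out.println(name + " -> " + ns));
    }
}
```

Running the same method on the contents of a real `METS.xml` shows the exact namespace strings a validator will compare against the schema.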

I am running RODA locally in Docker, but I get the same error when running it on https://demo.roda-community.org/.

A SIP created with roda-in works fine.

This is the generated SIP: sip-id.zip

prettybits commented 1 year ago

It looks like the METS schema file copied into the schemas subfolder by commons-ip is actually a locally modified version of the official METS 1.12 schema (the original schema resource is also present in the repo but not currently used). The modifications add the (C)SIP attributes to their respective elements using their prefixed form. ~Since the METS schema globally defines attributeFormDefault="unqualified" that to my understanding is why schema validation fails, expecting the attributes to be unqualified in the METS instance as well, which they can't be.~ (EDIT: this was wrong, see my comment below)

Besides that, I think the original METS schema file is the one that should be copied when creating a package.

luis100 commented 1 year ago

@jalshorji it seems that you are creating an E-ARK SIP version 2 but then selecting E-ARK SIP version 1 in RODA.

Be sure to select the correct option here: [image]

jalshorji commented 1 year ago

The answer from @prettybits solved the problem. Changing to E-ARK SIP 1.0 didn't help.

prettybits commented 1 year ago

Glad it helped, although my analysis was actually wrong regarding the prefix behaviour. Your package doesn't validate in current commons-ip because it uses the https://dilcis.eu/XML/METS/SIPExtensionMETS namespace, while validation is done against the locally modified schema (not the one included in the schemas subfolder), which has had a target namespace of https://DILCIS.eu/XML/METS/SIPExtensionMETS since 6ef8ed2977a4fb339c244d1779d0240e27a13d11, i.e. since v2.0.0. Notice the casing: namespace names are case-sensitive.
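To illustrate the casing point, here is a minimal sketch using only the JDK's DOM parser. It declares an attribute in the lowercase namespace and then looks it up under both spellings. The class name, the `root` element, and the `example` attribute are hypothetical. Only the two namespace URIs come from this thread.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceCaseCheck {
    static final String LOWER = "https://dilcis.eu/XML/METS/SIPExtensionMETS";
    static final String UPPER = "https://DILCIS.eu/XML/METS/SIPExtensionMETS";

    // Hypothetical instance document using the lowercase namespace.
    static final String SAMPLE =
        "<root xmlns:sip=\"" + LOWER + "\" sip:example=\"value\"/>";

    // Returns the attribute value, or "" when no attribute with that
    // namespace URI and local name exists on the root element.
    static String lookup(String xml, String namespaceUri, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttributeNS(namespaceUri, localName);
        } catch (Exception e) {
            // Wrap parser errors so callers need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same spelling as the declaration: the attribute is found.
        System.out.println("lower: '" + lookup(SAMPLE, LOWER, "example") + "'");
        // Different casing: treated as a different namespace, nothing found.
        System.out.println("upper: '" + lookup(SAMPLE, UPPER, "example") + "'");
    }
}
```

Namespace names are compared as plain strings, so a schema whose target namespace uses `DILCIS.eu` does not describe attributes declared under `dilcis.eu`.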

Not sure how exactly you created the linked package and how that went wrong.