DILCISBoard / E-ARK-CSIP

E-ARK Common Specification for Information Packages
http://earkcsip.dilcis.eu
Creative Commons Attribution 4.0 International

CSIP101 & CSIP105 are confusing #698

Open nvanderperren opened 1 year ago

nvanderperren commented 1 year ago

Hi, I find CSIP101 and CSIP105 confusing, and I want to check whether I understand them correctly.

CSIP101: Do you use this one if there are no representation subdirectories in the representations directory, i.e. the representations directory is empty (ergo, no representation METS is referenced in the fileSec)?

CSIP105: Do you use this one if there are representation subdirectories in the representations directory, i.e. the representations directory is not empty (ergo, a representation METS is referenced in the fileSec)?

My colleagues implemented the structMap like this:

<structMap ID="uuid-6b183791-bcf2-4491-913d-e3b553ef2b75" TYPE="PHYSICAL" LABEL="CSIP">
        <div ID="uuid-1dd9df64-94f5-46d4-9d76-3d09f2124412" LABEL="3d-example">
            <div ID="uuid-e9a09018-9c23-46c7-9768-aaf372fd33dc" LABEL="Metadata"
                ADMID="uuid-e2dcd7c5-5fad-4bcd-a7c7-762b0be75d0f"
                DMDID="uuid-3936403d-133f-4765-b3b9-0a46df28db17"
            />
            <div ID="uuid-fa1ecdd5-ecec-488d-baac-01025d996b77" LABEL="Representations"> <!-- content division -->
                <div ID="uuid-B0D5E486-C582-41BC-BD2D-50543FC897C1" LABEL="representation_1"> <!-- representation division -->
                    <mptr xlink:type="simple"
                        xlink:href="./representations/representation_1/mets.xml" LOCTYPE="URL" />
                </div>
                <div ID="uuid-FDF37384-FEB4-49BF-82EA-513F6185B89F" LABEL="representation_2">
                    <mptr xlink:type="simple"
                        xlink:href="./representations/representation_2/mets.xml" LOCTYPE="URL" />
                </div>
                <div ID="uuid-0AC25CA1-6802-413C-9E32-AAD6DA2550BF" LABEL="representation_3">
                    <mptr xlink:type="simple"
                        xlink:href="./representations/representation_3/mets.xml" LOCTYPE="URL" />
                </div>
                <div ID="uuid-6B54B923-2DA0-47DA-82FB-A67206D34BA7" LABEL="representation_4">
                    <mptr xlink:type="simple"
                        xlink:href="./representations/representation_4/mets.xml" LOCTYPE="URL" />
                </div>
            </div>
        </div>
    </structMap>

Did they implement it correctly, or should it be:

<mets:structMap ID="uuid-0a82fb06-5475-46c2-b041-be220edc3d42" TYPE="PHYSICAL" LABEL="CSIP">
        <mets:div ID="uuid-b739018c-742a-42f8-8aec-60583321f507">
            <mets:div ID="uuid-1f9bedb1-9408-4534-965c-e00d0826f45d" LABEL="metadata" 
            ADMID="uuid-5e3efa30-21ea-474c-a63c-730ad11cba68" 
            DMDID="dmd1 dmd2" />
            <mets:div ID="uuid-fe4eec1f-0b39-431e-bd1a-35eacc24fe38" LABEL="representations/representation_1">
                <mets:mptr LOCTYPE="URL" xlink:type="simple" xlink:href="./representations/representation_1/METS.xml" xlink:title="TEXT SCRAPE" />
            </mets:div>
        </mets:div>
</mets:structMap>

karinbredenberg commented 1 year ago

Hej! Yes, when you have just one representation present, CSIP101 is used. And when you have more than one representation CSIP105 is used. So yes, your "should it be" example is correct, with one div for each representation. This is because we are not nesting the div elements.

We will make sure to address this in the guideline!
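
To make the "not nesting" point concrete, here is a minimal Python sketch that lists every div holding an mptr and how deep it sits below the root div. The function name `representation_divs` and the `SAMPLE` document are hypothetical illustrations, not part of the CSIP specification or any validator; the sketch assumes namespace-prefixed METS as in the second example above.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample, modelled on the flat (non-nested) example in this thread.
SAMPLE = """<mets:structMap xmlns:mets="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink" TYPE="PHYSICAL" LABEL="CSIP">
  <mets:div ID="root">
    <mets:div ID="md" LABEL="Metadata"/>
    <mets:div ID="r1" LABEL="Representations/representation_1">
      <mets:mptr LOCTYPE="URL" xlink:type="simple"
                 xlink:href="representations/representation_1/METS.xml"/>
    </mets:div>
  </mets:div>
</mets:structMap>"""


def representation_divs(structmap_xml):
    """Return (LABEL, href, depth) for every div that contains an mptr.

    depth is 1 for a direct child of the root div, 2 for a grandchild, etc.
    """
    struct_map = ET.fromstring(structmap_xml)
    found = []

    def walk(div, depth):
        for child in div.findall(f"{{{METS_NS}}}div"):
            mptr = child.find(f"{{{METS_NS}}}mptr")
            if mptr is not None:
                href = mptr.get(f"{{{XLINK_NS}}}href")
                found.append((child.get("LABEL"), href, depth))
            walk(child, depth + 1)

    root_div = struct_map.find(f"{{{METS_NS}}}div")
    if root_div is not None:
        walk(root_div, 1)
    return found


if __name__ == "__main__":
    for label, href, depth in representation_divs(SAMPLE):
        print(depth, label, href)
```

With this sketch, a flat layout reports every representation div at depth 1, while a layout that wraps them in a "Representations" div reports depth 2.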

prettybits commented 1 year ago

I have to agree that, as they stand, these rules are somewhat confusing. If I understand it right, effectively either CSIP101 or CSIP105 must be used, but the cardinalities themselves are unable to express this, correct? The crucial requirements described in the opening paragraphs of 5.3.6. help in understanding CSIP105 a bit better and express its conditionality, so one could possibly infer that CSIP101 would be the alternative. Maybe it would help to introduce the terms "Content division" and "Representation division" a bit more clearly there?

The description for CSIP101 says

When no representations are present the content referenced in the file section file group with @USE attribute value “Representations” is described in the structural map as a single sub division.

while CSIP114 specifies the USE attribute more specifically as mets/fileSec/fileGrp[@USE=[starts-with('Representations')]].

The "no representation" example in 7.1. Appendix A also has the related fileGrp with USE="Representations/Submission/Data" and the file linked to representations/submission/data/SIARD.xml.
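
As a rough illustration of the starts-with matching in CSIP114, the following Python sketch collects the USE values of file groups directly under fileSec that begin with "Representations". The function name `representation_file_groups` and the `SAMPLE` document are hypothetical; the filtering is done in Python because the standard library's ElementTree does not support the XPath `starts-with` function.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample, loosely modelled on the Appendix A case quoted above.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp ID="fg-doc" USE="Documentation"/>
    <mets:fileGrp ID="fg-rep" USE="Representations/Submission/Data"/>
  </mets:fileSec>
</mets:mets>"""


def representation_file_groups(mets_xml):
    """Return USE values of fileSec/fileGrp elements starting with 'Representations'."""
    root = ET.fromstring(mets_xml)
    groups = root.findall("mets:fileSec/mets:fileGrp", NS)
    return [
        g.get("USE")
        for g in groups
        if (g.get("USE") or "").startswith("Representations")
    ]


if __name__ == "__main__":
    print(representation_file_groups(SAMPLE))
```

Note that a file group with USE="Representations/Submission/Data" matches the prefix test even though it would not match an exact comparison against "Representations".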

When you say

Yes, when you have just one representation present, CSIP101 is used. And when you have more than one representation CSIP105 is used.

and talk about a "one representation" case when so far I was thinking "no representation", I admit I'm even more confused now. ;)

The way I currently assume all this is meant:

Does this sound about right? I'd be happy for any further clarifications.