kitodo / kitodo-production

Kitodo.Production is a workflow management tool for mass digitization and is part of the Kitodo Digital Library Suite.
http://www.kitodo.org/software/kitodoproduction/
GNU General Public License v3.0

Release 3.4.3: Some issues with meta data loading from catalog #5242

Closed by stefanCCS 2 years ago

stefanCCS commented 2 years ago

It looks like catalog loading changed between Release 3.4.2 and 3.4.3. Importing a record now fails with an "InvalidMetadataValueException". What has changed here? This worked in the past.

In kitodo.log you can find this exception:

[ERROR] 2022-07-14 10:38:31.185 [http-nio-8080-exec-10] CatalogImportDialog - Cannot store "ContributorPerson » Relationship designation (code)": The value is invalid. Value: pbl
org.kitodo.exceptions.InvalidMetadataValueException: Cannot store "ContributorPerson » Relationship designation (code)": The value is invalid. Value: pbl
        at org.kitodo.production.forms.createprocess.ProcessSelectMetadata.getMetadata(ProcessSelectMetadata.java:148) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.ProcessSelectMetadata.getMetadataWithFilledValues(ProcessSelectMetadata.java:125) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.preserve(ProcessFieldedMetadata.java:653) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.getMetadata(ProcessFieldedMetadata.java:549) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.getMetadataWithFilledValues(ProcessFieldedMetadata.java:539) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.preserve(ProcessFieldedMetadata.java:653) ~[classes/:3.4.3]
        at org.kitodo.production.services.data.ImportService.transformToProcessDetails(ImportService.java:999) ~[classes/:3.4.3]
        at org.kitodo.production.services.data.ImportService.addTitleAndTiffHeaderDataToTempProcess(ImportService.java:437) ~[classes/:3.4.3]
        at org.kitodo.production.services.data.ImportService.createTempProcessFromDocument(ImportService.java:427) ~[classes/:3.4.3]
        at org.kitodo.production.services.data.ImportService.importProcessAndReturnParentID(ImportService.java:450) ~[classes/:3.4.3]
        at org.kitodo.production.services.data.ImportService.importProcessHierarchy(ImportService.java:520) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CatalogImportDialog.getRecordHierarchy(CatalogImportDialog.java:170) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CatalogImportDialog.getRecordById(CatalogImportDialog.java:211) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CatalogImportDialog.search(CatalogImportDialog.java:113) ~[classes/:3.4.3]

Additionally, I am attaching the catalogRecord.xml and the internalRecord.xml.

catalog-logs.tar.zip

stefanCCS commented 2 years ago

OK, I found something: in my ruleset, the "RoleCode" in "ContributorPerson" does not have the value "pbl" defined. So far, so clear. But the behavior of the software has still changed from 3.4.2 to 3.4.3. I assume that in 3.4.2 this value was simply ignored. Why does it raise an error now?

stefanCCS commented 2 years ago

Well, if I repair (extend) my ruleset, I get the next exception:

[ERROR] 2022-07-14 11:18:07.723 [http-nio-8080-exec-4] CatalogImportDialog - newProcess.docTypeMetadataMissing: CCS-RulesetV09-KITODOonly-rs
org.kitodo.exceptions.ProcessGenerationException: newProcess.docTypeMetadataMissing: CCS-RulesetV09-KITODOonly-rs
        at org.kitodo.production.helper.TempProcess.verifyDocType(TempProcess.java:202) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CreateProcessForm.fillCreateProcessForm(CreateProcessForm.java:675) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CatalogImportDialog.getRecordHierarchy(CatalogImportDialog.java:176) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CatalogImportDialog.getRecordById(CatalogImportDialog.java:211) ~[classes/:3.4.3]
        at org.kitodo.production.forms.createprocess.CatalogImportDialog.search(CatalogImportDialog.java:113) ~[classes/:3.4.3]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]

The docType is created/evaluated with this XSLT (excerpt):

    <xsl:template match="pica:record">
        <mets:mdWrap MDTYPE="PICAXML">
            <mets:xmlData>
                <kitodo:kitodo>
                    <xsl:apply-templates select="@*|node()"/>
                    <!-- ### DocType ### -->
                    <kitodo:metadata name="docType">
                        <xsl:variable name="status" select="pica:datafield[@tag='002@']/pica:subfield[@code='0']"/>
                        <xsl:variable name="genre" select="pica:datafield[@tag='013D']/pica:subfield[@code='a']"/>
                        <xsl:choose>
                            <xsl:when test="matches($status,'^[AO]a[acfgkmruvxy]')">
                                <xsl:choose>
                                    <xsl:when test="($genre='Handschrift')">        <!-- in all variants -->
                                        <xsl:text>Manuscript</xsl:text>
                                    </xsl:when>
                                    <xsl:when test="($genre='Musikhandschrift')">   <!-- in all variants -->
                                        <xsl:text>Manuscript</xsl:text>
                                    </xsl:when>
                                    <!-- "Graphics" taken out as no RuleSet-Definition found  --> 
                                    <!-- <xsl:when test="($genre='Bild')"> -->              <!-- in all but SLUB variants -->
                                        <!-- <xsl:text>Graphics</xsl:text> -->
                                    <!-- </xsl:when> -->
                                    <xsl:otherwise>                                 <!-- in all variants -->
                                        <xsl:text>Monograph</xsl:text>
                                    </xsl:otherwise>
                                </xsl:choose>

And you can see in internalRecord.xml that this is set: <kitodo:metadata name="docType">Monograph</kitodo:metadata>

So, what is happening here?

solth commented 2 years ago

Concerning the ProcessGenerationException with the error message "newProcess.docTypeMetadataMissing", I think the reason is that we introduced a new "functional metadata" in #4906, which is used to hold the "Document type". This was done in order to fix #4863. I guess this change was not part of 3.4.2, but only of 3.4.3.

Before that change, every ruleset had to contain a key with the id "docType". After the change, the metadata for the document type can be (and in fact has to be) configured freely using the attribute use="docType" in the ruleset.

So basically what you need to do is just replace

<key id="docType">

with

<key id="docType" use="docType">

in your ruleset of choice.
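
For orientation, a complete key declared this way might look roughly as follows. This is only a sketch: the id and use attributes come from the lines above, while the label text is an assumed placeholder and not copied from an actual ruleset.

```xml
<!-- Sketch only: the label text is an assumption for illustration. -->
<!-- use="docType" marks this key as the one holding the document type. -->
<key id="docType" use="docType">
    <label>Document type</label>
</key>
```

As described above, it is the use attribute, not the id, that tells the application which key holds the document type, so the id itself could in principle be chosen freely.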

stefanCCS commented 2 years ago

Yes, that's it! Many thanks.

Coming back to my other question: how and why has the behavior changed when the catalog includes a value that the ruleset does not define (see the example with the "pbl" value for "RoleCode")?

matthias-ronge commented 2 years ago

Checking for invalid values should always have been in place, but maybe it was missing or broken in the past and has now been fixed.

Complete your data type in the ruleset (add "pbl") to fix it. Or, change the data type to string (remove all <option> elements) if you want the field to accept any input.
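
The two alternatives could look roughly like this in the ruleset. This is a sketch, not a tested configuration: the key id "RoleCode", the value "pbl" and the label "Relationship designation (code)" are taken from this thread, while the "aut" option and the option label texts are illustrative assumptions.

```xml
<!-- Variant A: closed list. Only the listed values are accepted,
     so "pbl" has to be declared explicitly. -->
<key id="RoleCode">
    <label>Relationship designation (code)</label>
    <option value="aut">
        <label>Author</label>
    </option>
    <option value="pbl">
        <label>Publisher</label>
    </option>
</key>

<!-- Variant B: no <option> elements, so the field accepts any string. -->
<key id="RoleCode">
    <label>Relationship designation (code)</label>
</key>
```

Only one of the two variants would be used for a given key. Variant A keeps the import strict and fails on unknown codes; Variant B never fails on the value but also no longer validates it.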

stefanCCS commented 2 years ago

Understood, but I am a bit worried that in the future the catalog might be extended and support a new value for a field. Process generation would then fail, which might not be a good idea in a real production project. Hence my questions: 1) Is it possible to define something like a wildcard, e.g.:

<option value="*">
    <label>Sonstiges</label>
</option>

2) Is it possible to ignore this error (i.e. this field), so that process generation continues without the field (and maybe only produces a warning)?

matthias-ronge commented 2 years ago

That doesn’t make sense. The ruleset defines data types. Either a data type has a fixed set of values, or it does not. If there is only a limited set of options, only these values are allowed; in this case, use <option> elements to describe the allowed values. If arbitrary values may be added to the data type, then it isn’t a fixed type and you need a character string type; in that case, remove all the <option> elements. But to me it just looks like you had an incomplete list of relators in your ruleset. This list is maintained by the LOC and is used by many libraries around the world. It is, at the very least, very unlikely to change. Also, be aware of the relators.xml file in the ruleset directory, which is a representation of this list.

"Still there is change in behavior of the Software (change from 3.4.2 to 3.4.3)."

Yes, a lot changed; see the pull requests merged from April 12th to May 30th, 2022. See also the release notes.

stefanCCS commented 2 years ago

OK, many thanks for your explanation, especially for the pointer to relators.xml.

And concerning the release notes and pull requests: well, this is not very helpful (at least for me as a user, maybe a power user):

stefanCCS commented 2 years ago

I will close this issue now (I got everything to run).