kitodo / kitodo-production

Kitodo.Production is a workflow management tool for mass digitization and is part of the Kitodo Digital Library Suite.
http://www.kitodo.org/software/kitodoproduction/
GNU General Public License v3.0
58 stars 65 forks source link

NPE during import when import mapping and ruleset do not match #5217

Open solth opened 2 years ago

solth commented 2 years ago

I got a NullPointerException when trying to import some data using a mapping file that creates metadata groups and metadata entries within the group that are both unknown to the ruleset:

Bildschirmfoto 2022-07-05 um 09 52 00
markusweigelt commented 2 years ago

Same problem last week but i did not report as issue. :sweat:

Problem seams the IllegalStateException in ProcessFieldedMetadata line 346. In my case this exception is more meaningful. "Got complex metadata entry with key "ContributorCorporation" which isn't declared as substructured key in the rule set." So we should log this exception, improve or reconsider error handling. Cause after this exception is catched a fallback mechanism handle missing field using metadata.createUndefinedMetadataTable and null is passed as parameter which then leads to the npe.

java.lang.NullPointerException
    at org.kitodo.production.forms.createprocess.ProcessTextMetadata.addLeadingZeros(ProcessTextMetadata.java:48)
    at org.kitodo.production.forms.createprocess.ProcessTextMetadata.<init>(ProcessTextMetadata.java:39)
    at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.createMetadataEntryEdit(ProcessFieldedMetadata.java:340)
    at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.createUndefinedMetadataTable(ProcessFieldedMetadata.java:223)
    at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.createMetadataEntryEdit(ProcessFieldedMetadata.java:350)
    at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.createMetadataTable(ProcessFieldedMetadata.java:198)
    at org.kitodo.production.forms.createprocess.ProcessFieldedMetadata.<init>(ProcessFieldedMetadata.java:123)
    at org.kitodo.production.services.data.ImportService.initializeProcessDetails(ImportService.java:1049)
    at org.kitodo.production.services.data.ImportService.transformToProcessDetails(ImportService.java:997)
    at org.kitodo.production.services.data.ImportService.addTitleAndTiffHeaderDataToTempProcess(ImportService.java:437)
    at org.kitodo.production.services.data.ImportService.createTempProcessFromDocument(ImportService.java:427)
    at org.kitodo.production.services.data.ImportService.importProcessAndReturnParentID(ImportService.java:450)
    at org.kitodo.production.services.data.ImportService.importProcessHierarchy(ImportService.java:520)
    at org.kitodo.production.forms.createprocess.CatalogImportDialog.getRecordHierarchy(CatalogImportDialog.java:170)
    at org.kitodo.production.forms.createprocess.CatalogImportDialog.getRecordById(CatalogImportDialog.java:211)
    at org.kitodo.production.forms.createprocess.CatalogImportDialog.search(CatalogImportDialog.java:113)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.el.parser.AstValue.invoke(AstValue.java:252)
    at org.apache.el.MethodExpressionImpl.invoke(MethodExpressionImpl.java:266)
    at org.jboss.weld.util.el.ForwardingMethodExpression.invoke(ForwardingMethodExpression.java:40)
    at org.jboss.weld.el.WeldMethodExpression.invoke(WeldMethodExpression.java:50)
    at org.jboss.weld.util.el.ForwardingMethodExpression.invoke(ForwardingMethodExpression.java:40)
    at org.jboss.weld.el.WeldMethodExpression.invoke(WeldMethodExpression.java:50)
    at org.apache.myfaces.view.facelets.el.ContextAwareTagMethodExpression.invoke(ContextAwareTagMethodExpression.java:96)
    at org.apache.myfaces.application.ActionListenerImpl.processAction(ActionListenerImpl.java:74)
    at org.springframework.faces.webflow.FlowActionListener.processAction(FlowActionListener.java:71)
    at org.springframework.faces.model.SelectionTrackingActionListener.processAction(SelectionTrackingActionListener.java:64)
    at javax.faces.component.UICommand.broadcast(UICommand.java:120)
    at javax.faces.component.UIViewRoot._broadcastAll(UIViewRoot.java:1255)
    at javax.faces.component.UIViewRoot.broadcastEvents(UIViewRoot.java:420)
    at javax.faces.component.UIViewRoot._process(UIViewRoot.java:1741)
    at javax.faces.component.UIViewRoot.processApplication(UIViewRoot.java:935)
    at org.apache.myfaces.lifecycle.InvokeApplicationExecutor.execute(InvokeApplicationExecutor.java:42)
    at org.apache.myfaces.lifecycle.LifecycleImpl.executePhase(LifecycleImpl.java:195)
    at org.apache.myfaces.lifecycle.LifecycleImpl.execute(LifecycleImpl.java:142)
    at javax.faces.webapp.FacesServlet.service(FacesServlet.java:204)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.kitodo.production.servletfilter.EncodingFilter.doFilter(EncodingFilter.java:69)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.omnifaces.filter.GzipResponseFilter.doFilter(GzipResponseFilter.java:181)
    at org.omnifaces.filter.HttpFilter.doFilter(HttpFilter.java:108)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:327)
    at org.kitodo.production.security.SecurityObjectAccessFilter.doFilter(SecurityObjectAccessFilter.java:86)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:115)
    at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:81)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:122)
    at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:116)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:126)
    at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:81)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:109)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:149)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.session.ConcurrentSessionFilter.doFilter(ConcurrentSessionFilter.java:147)
    at org.springframework.security.web.session.ConcurrentSessionFilter.doFilter(ConcurrentSessionFilter.java:125)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:223)
    at org.springframework.security.web.authentication.AbstractAuthenticationProcessingFilter.doFilter(AbstractAuthenticationProcessingFilter.java:217)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:103)
    at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:89)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.header.HeaderWriterFilter.doHeadersAfter(HeaderWriterFilter.java:90)
    at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:75)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:112)
    at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:82)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:55)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.session.DisableEncodeUrlFilter.doFilterInternal(DisableEncodeUrlFilter.java:42)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:336)
    at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:211)
    at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:183)
    at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:354)
    at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:267)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.…
matthias-ronge commented 2 years ago

Root cause is #3762.

henning-gerhardt commented 2 years ago

@matthias-ronge : A 2 years old issue - which is referring to the meta data editor and not to the import dialog - is the root cause for a new introduced bug caused by latest changes in this area? This did work before latest changes by you was merged or at least you get an information about this not declared meta data fields on import.

matthias-ronge commented 2 years ago

Yes, the root cause is in that issue. The error just moved to a different place where it shows. The problem is that the ruleset module isn’t yet capable to detect if an undefined metadata entry is a group.

I added a better explanation of the problem in the issue description a minute ago.

solth commented 2 years ago

Would it help to just check whether super.settings is null in line 48 before proceeding with addLeadingZeros as preliminary workaround/hotfix? Something like

private String addLeadingZeros(String value) {
    if (Objects.nonNull(super.settings) && Objects.equals(super.settings.getInputType(), InputType.INTEGER)) {
        int valueLength = value.length();
        int minDigits = super.settings.getMinDigits();
        return valueLength >= minDigits ? value : "0".repeat(minDigits - valueLength).concat(value);
    } else {
        return value;
    }
}

I understand this wouldn't solve the underlying problem but at least the NPE during import could be prevented in cases that might occur pretty often.

solth commented 2 years ago

Ok, I see now that there are many more parts in the code where subsequent exceptions would have to be handled, so fixing the core issue might be more feasible.

solth commented 1 year ago

This seems to create a lot more trouble than expected, so the "blocking" label is warranted, I think.

solth commented 1 year ago

The NPE is now handled in the code and shows an error message instead of an error page with stack trace, so the blocking label can be removed.

BartChris commented 1 year ago

You still receive a Nullpointer-Exception when an IllegalStateException in this block is catched and the in the catch block is executed:

https://github.com/kitodo/kitodo-production/blob/27df5aa60e841d7d0a3d864970d762d008a0b68f/Kitodo/src/main/java/org/kitodo/production/forms/createprocess/ProcessFieldedMetadata.java#L394-L405

In my case i got there when i search in the K10-Plus by title and then select a record.

The problem is that the IllegalStateException is catched; the IllegalStateException should just be thrown so that error can bubble up.

When doing that the "real" exception is shown Got complex metadata entry with key "Person" which i sn't declared as substructured key in the rule set.

In addition to that the Catalog Exception has to be catched when getSelectedRecord() is called because getRecordById might throw a CatalogException:

https://github.com/kitodo/kitodo-production/blob/cdf3ef01ccca51a696109da32fa3ba3aca036c31/Kitodo/src/main/java/org/kitodo/production/forms/createprocess/CatalogImportDialog.java#L81-L83

   public void getSelectedRecord() {
        try {
            getRecordById(Helper.getRequestParameter(ID_PARAMETER_NAME));
        } catch (CatalogException e) {
            this.opacErrorMessage = e.getMessage();
            PrimeFaces.current().ajax().update("opacErrorDialog");
            PrimeFaces.current().executeScript("PF('opacErrorDialog').show();");
        }
    }

This prevents the Exception and prints an error message.

BartChris commented 1 year ago

In my example the metadata entry from K10plus is returning a complex "person" key. There is no "person" defined in my ruleset, so the oneValue() method for normal text content in createMetadataEntryEdit() is called and throwing an IllegalStateException because a person cannot be mapped to normal text.

https://github.com/kitodo/kitodo-production/blob/27df5aa60e841d7d0a3d864970d762d008a0b68f/Kitodo/src/main/java/org/kitodo/production/forms/createprocess/ProcessFieldedMetadata.java#L392

https://github.com/kitodo/kitodo-production/blob/27df5aa60e841d7d0a3d864970d762d008a0b68f/Kitodo/src/main/java/org/kitodo/production/forms/createprocess/ProcessFieldedMetadata.java#L472

This is catched in the createMetadataEntryEdit() method which leads the code in the catch-clause getting executed and prodocuing the NPE. Removing the catch block there prevents that. I am however too unfamiliar with the exact purpose of the catch block (is it needed in specific scearios?) to propose a pull request here.

matthias-ronge commented 1 year ago

A stack trace showing is bad, but:

It is still an operators’ mistake if a key—including a grouped key—is not defined in the ruleset. You should not edit any data that you don’t know it exists. Especially, metadata groups shouldn’t appear so randomly in your data that you cannot predict and define them in the ruleset.

Production tries to handle this issue some way better then in version 2 and doesn’t crash anymore for unknown simple keys, but for historic reasons, it had to guess for a key if it is grouped or not before it knew the data, and so assuming “it is not” was the better trade-off.


General explaination (taken from #3762, which is a different exception, but caused by the very same problem): If the ruleset module gets a key ID that isn’t defined in the ruleset, it considers it as a text metadata. This goes back to days where the reading was that the modules should be independent of each other, and thus the ruleset module only gets the ID of a key, but didn’t have its class on its classpath, and so couldn’t check if it is complex. So, it returns a text box to be rendered, and the JSP fails to put a metadata group into the text string of the text box.

The assumption behind this behaviour was that, if ever there is an entry in the metadata which is missing in the ruleset, this is likely to be a value; because metadata groups are unlikely to appear “by accident”, but maybe an entry does, if it was removed from the ruleset in belief that it had never been used, but it was used in one single process that the editor of the ruleset didn’t know about. Unlike 2.x, this shouldn’t crash the metadata editor anymore, which it did in thie past.

BartChris commented 1 year ago

A stack trace showing is bad, but:

It is still an operators’ mistake if a key—including a grouped key—is not defined in the ruleset. You should not edit any data that you don’t know it exists. Especially, metadata groups shouldn’t appear so randomly in your data that you cannot predict and define them in the ruleset.

I have a more general question because i just stumbled over that: Apparently it is not considered a violation of the ruleset if a key is set inside a (root) division (e.g. inside 'monograph') by import from a catalog source although it is not defined in the list of allowed keys for that division.

If a) This key is defined in the ruleset, but not marked as allowed for monographs and only for periodicals for example no indication is given that this key is not in the list of allowed keys for that division (monograph). There is also no validation warning or error. The user can just save the process without any indication that there might be a key which is not explicitely allowed by the ruleset.

If b) The key is not defined in the ruleset the user gets at least an indication, that this key is not defined in the ruleset at all. No validation warning or error here.

I am just wondering: Is this intended behaviour? It is clear that there should be no crash of the application when data not defined in the ruleset is imported. But should this data not be considered as invalid by the application? cc: @andre-hohmann Edit: this Topic was also covered in this issue: https://github.com/kitodo/kitodo-production/issues/4953 But with no concluding answer

matthias-ronge commented 1 year ago

I think the case is simply not implemented. I think the XSLT should distinguish by doctype in such cases and only map the value when it is desired. It would be inconvenient for the user to have to manually delete the value for each import. But that's just my opinion. If you want this functionality, you can open a ticket for it.

Question: Does this have to do with the bug described in this ticket, i.e. throwing a null pointer exception?

BartChris commented 1 year ago

Thank you. It was not only about this specific behaviour, but the whole intended behaviour around undefined fields seemed underspecified/undecided to me, so i was looking for clarification. I will check wether to open an issue or maybe a discussion as well.

andre-hohmann commented 1 year ago

@BartChris : I think a discussion is appropriate to work out the issue. Perhaps we need to consider two things

1 The behavior when importing processes with undefined elements

2 The validation during import in general

solth commented 1 year ago

@BartChris & @andre-hohmann I agree that we should discuss that question in a dedicated discussion instead of an issue!

@matthias-ronge while I agree that it's the operators or administrators fault if a ruleset and accompanying XSLT mapping are not configured correctly, this is an error that's unfortunately all to easy to make. In my opinion, the system should be sturdier when it comes to such configuration problems and offer some kind of graceful degradation.

Displaying an error dialog about problems with the configuration instead of the unfiltered stack trace would already be a significant step up and would allow the user to at least continue using the current page (instead of dealing with a seemingly broken system). For configuration debugging purposes the stack trace could still be written to the log files (with a note in the popup dialog informing the user that the log files might contain additional information), but displaying it to the user will not do any good.

solth commented 3 months ago

Votes: 0