IQSS / dataverse

Open source research data repository software
http://dataverse.org

Dataset Page: If json-ld export is invalid, prevents dataset page from loading #5489

Closed kcondon closed 5 years ago

kcondon commented 5 years ago

Reported by a user, see 271620. Loading their dataset throws a 500 error. The dataset has 3 published versions: 1.0, 1.1, and 1.2. v1.0 and v1.1 are accessible; v1.2 throws the error. The Versions tab for the accessible versions shows a diff with v1.2 on the keyword field, but it is not apparent what the change is beyond some leading/trailing whitespace. The keyword field is large, with lots of HTML links. The stack trace seems to indicate that the json-ld export is failing. Since that output is part of the page, and the latest published version attempts to export if the cached file is not there (we think), that might account for the issue. The stack trace indicated that the export failed because of a compound contributor field with a type selected (Funder) but with no family name entered and saved. This can be seen in the current version: saving and publishing fail because the export fails.

We're still working on a workaround, but poking in either a dummy or a valid json-ld file does not fix it. Here is the original stack trace:

[2019-01-25T12:16:01.653-0500] [glassfish 4.1] [SEVERE] [] [javax.enterprise.resource.webcontainer.jsf.application] [tid: _ThreadID=251 _ThreadName=jk-connector(33)] [timeMillis: 1548436561653] [levelValue: 1000] [[

Error Rendering View[/dataset.xhtml]

javax.el.ELException: /dataset.xhtml @39,66 value="#{DatasetPage.jsonLd}": java.lang.NullPointerException: Value in JsonObjects name/value pair cannot be null

at com.sun.faces.facelets.el.TagValueExpression.getValue(TagValueExpression.java:114)

at javax.faces.component.ComponentStateHelper.eval(ComponentStateHelper.java:194)

at javax.faces.component.ComponentStateHelper.eval(ComponentStateHelper.java:182)

at javax.faces.component.UIOutput.getValue(UIOutput.java:174)

at com.sun.faces.renderkit.html_basic.HtmlBasicInputRenderer.getValue(HtmlBasicInputRenderer.java:205)

at com.sun.faces.renderkit.html_basic.HtmlBasicRenderer.getCurrentValue(HtmlBasicRenderer.java:355)

at com.sun.faces.renderkit.html_basic.HtmlBasicRenderer.encodeEnd(HtmlBasicRenderer.java:164)

at javax.faces.component.UIComponentBase.encodeEnd(UIComponentBase.java:919)

at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1863)

at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1859)

at org.primefaces.renderkit.HeadRenderer.encodeBegin(HeadRenderer.java:62)

at javax.faces.component.UIComponentBase.encodeBegin(UIComponentBase.java:864)

at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1854)

at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1859)

at com.sun.faces.application.view.FaceletViewHandlingStrategy.renderView(FaceletViewHandlingStrategy.java:456)

at com.sun.faces.application.view.MultiViewHandler.renderView(MultiViewHandler.java:133)

at javax.faces.application.ViewHandlerWrapper.renderView(ViewHandlerWrapper.java:337)

at org.ocpsoft.rewrite.faces.RewriteViewHandler.renderView(RewriteViewHandler.java:196)

at javax.faces.application.ViewHandlerWrapper.renderView(ViewHandlerWrapper.java:337)

at com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:120)

at com.sun.faces.lifecycle.Phase.doPhase(Phase.java:101)

at com.sun.faces.lifecycle.LifecycleImpl.render(LifecycleImpl.java:219)

at javax.faces.webapp.FacesServlet.service(FacesServlet.java:647)

at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1682)

at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:344)

at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)

at org.glassfish.tyrus.servlet.TyrusServletFilter.doFilter(TyrusServletFilter.java:295)

at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)

at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)

at org.ocpsoft.rewrite.servlet.RewriteFilter.doFilter(RewriteFilter.java:226)

at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)

at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)

at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:316)

at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:160)

at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:734)

at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:673)

at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:99)

at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:174)

at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:734)

at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:673)

at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:412)

at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:282)

at com.sun.enterprise.v3.services.impl.ContainerMapper$HttpHandlerCallable.call(ContainerMapper.java:459)

at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:167)

at org.glassfish.grizzly.http.server.HttpHandler.runService(HttpHandler.java:201)

at org.glassfish.grizzly.http.server.HttpHandler.doHandle(HttpHandler.java:175)

at org.glassfish.grizzly.http.server.HttpServerFilter.handleRead(HttpServerFilter.java:235)

at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119)

at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284)

at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201)

at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133)

at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)

at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77)

at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:561)

at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)

at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:117)

at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:56)

at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:137)

at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)

at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)

at java.lang.Thread.run(Thread.java:748)

Caused by: javax.el.ELException: java.lang.NullPointerException: Value in JsonObjects name/value pair cannot be null

at javax.el.BeanELResolver.getValue(BeanELResolver.java:368)

at com.sun.faces.el.DemuxCompositeELResolver._getValue(DemuxCompositeELResolver.java:176)

at com.sun.faces.el.DemuxCompositeELResolver.getValue(DemuxCompositeELResolver.java:203)

at com.sun.el.parser.AstValue.getValue(AstValue.java:140)

at com.sun.el.parser.AstValue.getValue(AstValue.java:204)

at com.sun.el.ValueExpressionImpl.getValue(ValueExpressionImpl.java:226)

at org.jboss.weld.el.WeldValueExpression.getValue(WeldValueExpression.java:50)

at com.sun.faces.facelets.el.TagValueExpression.getValue(TagValueExpression.java:109)

... 60 more

Caused by: java.lang.NullPointerException: Value in JsonObjects name/value pair cannot be null

at org.glassfish.json.JsonObjectBuilderImpl.validateValue(JsonObjectBuilderImpl.java:164)

at org.glassfish.json.JsonObjectBuilderImpl.add(JsonObjectBuilderImpl.java:74)

at edu.harvard.iq.dataverse.DatasetVersion.getJsonLd(DatasetVersion.java:1743)

at edu.harvard.iq.dataverse.export.SchemaDotOrgExporter.exportDataset(SchemaDotOrgExporter.java:79)

at edu.harvard.iq.dataverse.export.ExportService.cacheExport(ExportService.java:276)

at edu.harvard.iq.dataverse.export.ExportService.exportFormat(ExportService.java:212)

at edu.harvard.iq.dataverse.export.ExportService.getExport(ExportService.java:99)

at edu.harvard.iq.dataverse.export.ExportService.getExportAsString(ExportService.java:118)

at edu.harvard.iq.dataverse.DatasetPage.getJsonLd(DatasetPage.java:4321)

at sun.reflect.GeneratedMethodAccessor1071.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at javax.el.BeanELResolver.getValue(BeanELResolver.java:363)

... 67 more

]]

kcondon commented 5 years ago

There is another issue that led to this behavior: the Contributor compound field allows a type without a name value, but the json-ld export does not and fails. So, two main issues here:

  1. Fix dataset page so if json-ld export fails it still loads.
  2. Make export rules consistent with respect to compound fields, e.g. Contributor type/name.
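
For the first item, one possible shape is a getter that swallows export failures and falls back to an empty string, so the page renders without the JSON-LD block instead of returning a 500. This is only a sketch: `JsonLdFallbackSketch` and `jsonLdOrEmpty` are hypothetical names, and the `Supplier` stands in for whatever call the page makes into the export service.

```java
import java.util.function.Supplier;

class JsonLdFallbackSketch {
    // Hypothetical helper: run the export, but never let a failure escape to the page.
    static String jsonLdOrEmpty(Supplier<String> export) {
        try {
            String jsonLd = export.get();
            return jsonLd != null ? jsonLd : "";
        } catch (RuntimeException e) {
            // A real page bean would log this; the page then renders without JSON-LD.
            return "";
        }
    }

    public static void main(String[] args) {
        // Simulates the failure from the stack trace above.
        Supplier<String> broken = () -> {
            throw new NullPointerException("Value in JsonObjects name/value pair cannot be null");
        };
        System.out.println("[" + jsonLdOrEmpty(broken) + "]");
        System.out.println(jsonLdOrEmpty(() -> "{\"@type\":\"Dataset\"}"));
    }
}
```

Catching the exception only hides the symptom; the dataset still has no valid export until the second item is addressed.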

landreev commented 5 years ago

This is the promised one other bug I saw in that export. It is in DatasetVersion.java, in the code that generates the "funder" portion of the schema.org export:

boolean addFunder = false;
for (DatasetFieldCompoundValue contributorValue : dsf.getDatasetFieldCompoundValues()) {
    String contributorName = null;
    String contributorType = null;
    for (DatasetField subField : contributorValue.getChildDatasetFields()) {
        if (subField.getDatasetFieldType().getName().equals(DatasetFieldConstant.contributorName)) {
            contributorName = subField.getDisplayValue();
        }
        if (subField.getDatasetFieldType().getName().equals(DatasetFieldConstant.contributorType)) {
            contributorType = subField.getDisplayValue();
            // TODO: Consider how this will work in French, Chinese, etc.
            String funderString = "Funder";
            if (funderString.equals(contributorType)) {
                addFunder = true;
            }
        }
    }
    if (addFunder) {
        retList.add(contributorName);
    }
}

The way the variable addFunder is set to true, it looks like once it finds one contributor field of type "Funder", it's going to assume that all the subsequent ones are funders too - right?
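
A minimal sketch of one way to avoid that carry-over, assuming the flag is simply decided per contributor rather than once for the whole loop. The `Contributor` class and `funderNames` method are hypothetical stand-ins for the compound field values, not Dataverse classes, and the sketch also skips funders saved without a name, which is the case that triggered the NullPointerException above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FunderFilterSketch {
    // Hypothetical stand-in for one contributor compound value (type + name).
    static final class Contributor {
        final String type;
        final String name;

        Contributor(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    static List<String> funderNames(List<Contributor> contributors) {
        List<String> funders = new ArrayList<>();
        for (Contributor c : contributors) {
            // Decided per contributor, so nothing carries over from earlier entries.
            boolean isFunder = "Funder".equals(c.type);
            // Skip funders saved without a name so no null reaches the JSON builder.
            if (isFunder && c.name != null && !c.name.trim().isEmpty()) {
                funders.add(c.name);
            }
        }
        return funders;
    }

    public static void main(String[] args) {
        List<Contributor> contributors = Arrays.asList(
                new Contributor("Funder", "National Science Foundation"),
                new Contributor("Data Collector", "Holmes, Sherlock"),
                new Contributor("Funder", null));
        System.out.println(funderNames(contributors)); // prints [National Science Foundation]
    }
}
```

In the original loop the equivalent change would be to declare addFunder inside the outer for loop, so it resets for each contributor.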

pdurbin commented 5 years ago

@landreev shoot. You're right. If I alter the order of contributors used in SchemaDotOrgExporterTest like this...

murphy:dataverse pdurbin$ git diff
diff --git a/src/test/resources/json/dataset-finch2.json b/src/test/resources/json/dataset-finch2.json
index b3c01eb3d..ffb95bb68 100644
--- a/src/test/resources/json/dataset-finch2.json
+++ b/src/test/resources/json/dataset-finch2.json
@@ -180,13 +180,13 @@
                   "typeName": "contributorType",
                   "multiple": false,
                   "typeClass": "controlledVocabulary",
-                  "value": "Data Collector"
+                  "value": "Funder"
                 },
                 "contributorName": {
                   "typeName": "contributorName",
                   "multiple": false,
                   "typeClass": "primitive",
-                  "value": "Holmes, Sherlock"
+                  "value": "National Science Foundation"
                 }
               },
               {
@@ -194,13 +194,13 @@
                   "typeName": "contributorType",
                   "multiple": false,
                   "typeClass": "controlledVocabulary",
-                  "value": "Funder"
+                  "value": "Data Collector"
                 },
                 "contributorName": {
                   "typeName": "contributorName",
                   "multiple": false,
                   "typeClass": "primitive",
-                  "value": "National Science Foundation"
+                  "value": "Holmes, Sherlock"
                 }
               }
             ]
murphy:dataverse pdurbin$ 

... Sherlock Holmes becomes a funder even though he's really a Data Collector:

  "funder": [
    {
      "@type": "Organization",
      "name": "National Science Foundation"
    },
    {
      "@type": "Organization",
      "name": "Holmes, Sherlock"
    },
    {
      "@type": "Organization",
      "name": "National Institutes of Health"
    }
  ],

When we fix this bug, we should modify the dataset-finch2.json file too, as above or similarly. For the curious, "National Institutes of Health" comes from the other way that funders are added (grantNumberAgency). Good catch!

@kcondon in sprint planning I was saying we should switch to Michael's NullSafeJsonBuilder on this line: JsonObjectBuilder funder = Json.createObjectBuilder();
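
To illustrate the idea of a null-safe builder, here is a rough sketch of a builder whose add method skips null values instead of throwing. It is not Michael's NullSafeJsonBuilder, whose API this thread does not show; `NullSkippingBuilderSketch` is a hypothetical name, and a plain `Map` stands in for `JsonObjectBuilder` so the example has no dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NullSkippingBuilderSketch {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Unlike JsonObjectBuilder.add, a null value is skipped rather than rejected.
    NullSkippingBuilderSketch add(String name, String value) {
        if (value != null) {
            fields.put(name, value);
        }
        return this;
    }

    Map<String, String> build() {
        // Return a copy so later add calls do not change an already built result.
        return new LinkedHashMap<>(fields);
    }

    public static void main(String[] args) {
        String missingName = null; // a funder saved with a type but no name
        Map<String, String> funder = new NullSkippingBuilderSketch()
                .add("@type", "Organization")
                .add("name", missingName)
                .build();
        System.out.println(funder); // prints {@type=Organization}
    }
}
```

Skipping the null avoids the exception but leaves a funder entry with only "@type", so the export may still want to drop such entries entirely.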

jggautier commented 5 years ago

I opened an issue in RT (https://help.hmdc.harvard.edu/Ticket/Display.html?id=271888) about a different dataset that seems to be published (https://dataverse.harvard.edu/permissions-manage.xhtml?id=2710605), but when you try to view it you get a 500 error.

It's one of the few datasets in Harvard Dataverse saved with a Funder contributor type but no value. Not sure if this bug is what's causing this dataset to be unviewable. (The dataset has only one version.)