isamplesorg / isamples_inabox

Provides functionality intermediate to a collection and central

OpenContext failing to reindex after latest content fetch and update to new metadata format #335

Closed by dannymandel 9 months ago

dannymandel commented 10 months ago

After fetching the latest OpenContext records, I noticed that the indexer was failing with this exception:

2023-12-08 16:12:46.174 ERROR (qtp586358252-326) [isb_core_records_3 shard1 core_node2 isb_core_records_3_shard1_replica_n1] o.a.s.h.RequestHandlerBase Client exception => org.apache.solr.common.SolrException: Exception writing document id ark:/28722/k2tb16c3r to the index; possible analysis error.
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:335)
org.apache.solr.common.SolrException: Exception writing document id ark:/28722/k2tb16c3r to the index; possible analysis error.

Digging in a bit, the root exception was actually:

Caused by: java.lang.ClassCastException: class java.lang.String cannot be cast to class org.apache.solr.common.SolrInputDocument (java.lang.String is in module java.base of loader 'bootstrap'; org.apache.solr.common.SolrInputDocument is in unnamed module of loader org.eclipse.jetty.webapp.WebAppClassLoader @120f38e6)
    at org.apache.solr.update.AddUpdateCommand.flattenLabelled(AddUpdateCommand.java:270) ~[?:?]
    at org.apache.solr.update.AddUpdateCommand.flatten(AddUpdateCommand.java:250) ~[?:?]

Many of the OpenContext documents indexed just fine, however. Comparing the broken document with the ones that worked, I noticed that the broken one had one of the complex keyword descriptions we implemented for Getty:

    "keywords":
    [
        {
            "id": "https://vocab.getty.edu/aat/300247919",
            "label": "fossils"
        },

The documents that indexed successfully did not contain any of these object-valued keywords.
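For anyone hitting the same thing, here is a minimal sketch of one way to work around it on the client side before documents are sent to Solr. It is not necessarily what the eventual fix does. It assumes each record is a plain dict with an optional `keywords` list whose entries are either strings or objects with `id` and `label` keys, as in the snippet above. The function names `has_complex_keywords` and `flatten_keywords` are hypothetical, and the plain-string keyword in the usage example is made up for illustration.

```python
from typing import Any


def has_complex_keywords(record: dict[str, Any]) -> bool:
    """Return True if any keyword in the record is an object rather than a string."""
    return any(isinstance(k, dict) for k in record.get("keywords", []))


def flatten_keywords(keywords: list[Any]) -> list[str]:
    """Reduce every keyword to a plain string so the field holds only scalars.

    Object-valued keywords are replaced by their "label", falling back to
    their "id" when no label is present. Entries with neither are dropped.
    """
    flattened = []
    for keyword in keywords:
        if isinstance(keyword, dict):
            value = keyword.get("label") or keyword.get("id")
        else:
            value = keyword
        if value is not None:
            flattened.append(str(value))
    return flattened


# Usage: the first keyword mirrors the Getty entry from the failing record;
# the second is an illustrative plain-string keyword, not taken from real data.
record = {
    "keywords": [
        {"id": "https://vocab.getty.edu/aat/300247919", "label": "fossils"},
        "bone",
    ]
}
if has_complex_keywords(record):
    record["keywords"] = flatten_keywords(record["keywords"])
```

Flattening to the label loses the vocabulary URI, so if the URI needs to stay searchable it would have to go into a separate string field. That choice depends on the schema and is left open here.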

dannymandel commented 9 months ago

Fixed by https://github.com/isamplesorg/isamples_inabox/pull/336