thatbudakguy closed this 4 months ago
Here's an example of the output you see when indexing. I modified one document to have bad metadata to demonstrate the error handling.
```
W, [2024-02-23T14:18:28.933869 #41378] WARN -- GeoCombine: SOLR_URL not set; using Blacklight default
I, [2024-02-23T14:18:28.936742 #41378] INFO -- GeoCombine: indexing into http://localhost:8983/solr/blacklight-core
I, [2024-02-23T14:18:28.936772 #41378] INFO -- GeoCombine: loading documents from tmp/opengeometadata
E, [2024-02-23T14:19:01.621359 #41378] ERROR -- GeoCombine: error indexing batch (100 docs): 400 Bad Request - ERROR: [doc=VAC3073-M-01154] Error adding field 'solr_geom'='fdjlskfjdlkjfw' msg=Unable to parse shape given formats "lat,lon", "x y" or as WKT because java.text.ParseException: Unknown Shape definition [fdjlskfjdlkjfw]
W, [2024-02-23T14:19:01.621400 #41378] WARN -- GeoCombine: retrying documents individually
E, [2024-02-23T14:19:01.675005 #41378] ERROR -- GeoCombine: error indexing tmp/opengeometadata/fake/geoblacklight.json: 400 Bad Request - ERROR: [doc=VAC3073-M-01154] Error adding field 'solr_geom'='fdjlskfjdlkjfw' msg=Unable to parse shape given formats "lat,lon", "x y" or as WKT because java.text.ParseException: Unknown Shape definition [fdjlskfjdlkjfw]
I, [2024-02-23T14:19:03.313709 #41378] INFO -- GeoCombine: indexed 10084 documents in 34.38 seconds
```
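The log above shows the retry strategy: documents are sent to Solr in batches, and when a batch is rejected, each document in it is retried individually so one bad record doesn't sink the other 99. A minimal sketch of that strategy, using a hypothetical `FakeSolr` client and `index_all` method (illustrative names, not GeoCombine's real API):

```ruby
require "logger"

# Stand-in for a Solr client that rejects documents with unparseable geometry,
# mimicking the 400 Bad Request seen in the log output above.
class FakeSolr
  class BadRequest < StandardError; end

  def add(docs)
    bad = docs.find { |d| d["solr_geom"] == "fdjlskfjdlkjfw" }
    raise BadRequest, "400 Bad Request - doc=#{bad["id"]}" if bad
    docs.size
  end
end

# Index in batches; on a batch failure, fall back to indexing each document
# in that batch individually and log (but skip) the ones that still fail.
def index_all(solr, docs, batch_size: 100, logger: Logger.new($stderr))
  indexed = 0
  docs.each_slice(batch_size) do |batch|
    begin
      indexed += solr.add(batch)
    rescue FakeSolr::BadRequest => e
      logger.error("error indexing batch (#{batch.size} docs): #{e.message}")
      logger.warn("retrying documents individually")
      batch.each do |doc|
        begin
          indexed += solr.add([doc])
        rescue FakeSolr::BadRequest => err
          logger.error("error indexing #{doc["id"]}: #{err.message}")
        end
      end
    end
  end
  indexed
end
```

With this shape, a run over 10,084 documents where one has bad metadata still indexes the other 10,083, matching the totals in the log.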
This admittedly large PR happened because I wanted to set up regularly scheduled updates from OpenGeoMetadata for Earthworks via cron, and GeoCombine seemed like the best way to do it (https://github.com/sul-dlss/earthworks/issues/639). It tackles several outstanding issues:
- If Blacklight is available to import (i.e. in a GeoBlacklight installation), automatically target the Solr core configured for it (#166)
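The `SOLR_URL not set; using Blacklight default` warning in the log reflects this fallback. A hedged sketch of the resolution order, assuming an explicit `SOLR_URL` wins and Blacklight's configured core is used only when the gem is loaded (`resolve_solr_url` is an illustrative name, not GeoCombine's actual method):

```ruby
# Resolve the Solr URL to index into: prefer the SOLR_URL environment
# variable; otherwise, if the Blacklight gem is loaded, fall back to the
# connection settings Blacklight reads from its own configuration.
def resolve_solr_url(env = ENV)
  return env["SOLR_URL"] if env["SOLR_URL"]

  if defined?(Blacklight)
    # Blacklight exposes its Solr connection settings as a hash with a
    # :url key via Blacklight.connection_config.
    Blacklight.connection_config[:url]
  end
end
```

When neither source is available this returns `nil`, at which point the indexer would need to fail with a clear message rather than guess at a core.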