openva / richmondsunlight.com

The Richmond Sunlight website.
https://www.richmondsunlight.com/
MIT License

Move search to ElasticSearch #37

Open waldoj opened 10 years ago

waldoj commented 10 years ago

Now that Elasticsearch is installed, index legislation with it rather than with Sphinx.

Done right, indexing legislation means exporting legislation as JSON. This should be as simple as generating a list of new legislation, requesting each bill from the API, and submitting the resulting JSON to Elasticsearch.
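A minimal sketch of that loop in Python, for illustration only. Both URLs are hypothetical placeholders (neither the API route nor the index name `bills` comes from this issue), and the `/_doc/<id>` path follows recent Elasticsearch versions; older versions expect a mapping type name in that position. The fetch and send steps are passed in as parameters so the flow can be exercised without a network.

```python
import json
import urllib.request

# Hypothetical endpoints; neither URL comes from the issue.
API_URL = "https://api.example.com/bill/{bill_id}.json"
ES_INDEX_URL = "http://localhost:9200/bills"


def fetch_bill(bill_id):
    """Fetch one bill as a dict from the (hypothetical) JSON API."""
    with urllib.request.urlopen(API_URL.format(bill_id=bill_id)) as resp:
        return json.load(resp)


def build_index_request(bill_id, bill):
    """Build a PUT request that stores the bill under an explicit document ID."""
    return urllib.request.Request(
        f"{ES_INDEX_URL}/_doc/{bill_id}",
        data=json.dumps(bill).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send_request(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as resp:
        return resp.status


def index_new_bills(bill_ids, fetch=fetch_bill, send=send_request):
    """Fetch each new bill and submit it to Elasticsearch; return the count."""
    count = 0
    for bill_id in bill_ids:
        send(build_index_request(bill_id, fetch(bill_id)))
        count += 1
    return count
```

Because each request names its document ID in the URL, running the loop again for the same bill overwrites the existing document instead of adding a second copy.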

waldoj commented 7 years ago

Use Elasticsearch's bulk importer. jq may be a good tool for reformatting the JSON into the shape the bulk importer expects.
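The bulk endpoint (`POST /_bulk`, sent as `application/x-ndjson`) takes newline-delimited JSON: one action line, then one source line per document, with a trailing newline at the end. As a rough illustration of the reshaping step, here is a Python sketch used in place of jq. The index name `bills` and the `id` field are assumptions, not names taken from this project.

```python
import json


def to_bulk_ndjson(bills, index="bills", id_field="id"):
    """Turn a list of bill dicts into a bulk-API body (NDJSON).

    `index` and `id_field` are hypothetical defaults for illustration.
    """
    lines = []
    for bill in bills:
        # Action line: tells Elasticsearch which index and document ID to use.
        action = {"index": {"_index": index, "_id": str(bill[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(bill))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The returned string can be written to a file and posted with any HTTP client.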

waldoj commented 7 years ago

It doesn't look like the provided document IDs (in the JSON header material) are being used by Elasticsearch. Elasticsearch is just using random identifiers (e.g., IQSkQ3btTI6L8KEvdnnlnA). That's going to make updates impossible: we'll just have an ever-growing index, full of duplicates.
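Elasticsearch generates an identifier of its own whenever an indexing action arrives without an `_id`, so one thing worth ruling out is an action line whose metadata lacks that key. The helper below is a hypothetical check, not a diagnosis of what went wrong here, and it assumes the payload holds only `index` or `create` actions, so that action and source lines strictly alternate.

```python
import json


def actions_missing_id(ndjson_body):
    """Return positions (1-based, counting non-blank lines) of action lines without `_id`."""
    lines = [line for line in ndjson_body.splitlines() if line.strip()]
    missing = []
    # Assumption: action and source lines alternate, starting with an action line.
    for offset in range(0, len(lines), 2):
        action = json.loads(lines[offset])
        # Each action is a single-key object, e.g. {"index": {...metadata...}}.
        metadata = next(iter(action.values()))
        if "_id" not in metadata:
            missing.append(offset + 1)
    return missing
```

An empty list means every action line names its document ID explicitly.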

waldoj commented 7 years ago

I moved to using the bills' database IDs as the document IDs, and that fixed things. I'm not sure what was problematic about using e.g. 2019-hb123 as the document ID, but it wasn't working.

waldoj commented 7 years ago

I've got Elasticsearch configured on another server (to avoid taxing this one), with a rule opening up the firewall, but Elasticsearch is rejecting the connection. Figure out why, and ensure that the new firewall rule is loaded on reboot.
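A quick TCP probe from the client machine can help narrow this down. As a rule of thumb (not a certainty), a refused connection suggests nothing is listening on that interface, which would be consistent with Elasticsearch being bound only to localhost via its `network.host` setting, while a timeout suggests a firewall silently dropping packets. The sketch below uses only the Python standard library; the host and port in the usage note are placeholders.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as open, refused, timeout, or another error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on this port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, often a sign of dropped packets.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

For example, `probe("search.example.com", 9200)` would test Elasticsearch's default HTTP port on a hypothetical host.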

waldoj commented 7 years ago

Server moved successfully.