Open waldoj opened 10 years ago
Use Elasticsearch's bulk importer. jq may be a good tool to get the JSON reformatted.
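For reference, the reformatting step could also be done in a few lines of Python instead of jq. This is only a sketch: the index name `bills`, the `id` field, and the sample records are assumptions for illustration, not taken from the actual export. The bulk endpoint expects newline-delimited JSON, with an action line before each document and a trailing newline at the end.

```python
import json


def to_bulk_body(bills, index="bills", id_field="id"):
    """Return a newline-delimited JSON string for Elasticsearch's _bulk endpoint.

    `index` and `id_field` are illustrative defaults, not the real schema.
    """
    lines = []
    for bill in bills:
        # Action line: names the target index and the document ID to write to.
        action = {"index": {"_index": index, "_id": str(bill[id_field])}}
        lines.append(json.dumps(action))
        # Source line: the document itself, serialized onto a single line.
        lines.append(json.dumps(bill))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample records standing in for exported bills.
sample = [
    {"id": 101, "number": "hb123", "year": 2019},
    {"id": 102, "number": "sb45", "year": 2019},
]
body = to_bulk_body(sample)
```

The resulting `body` would be sent as a POST to `/_bulk`. Recent Elasticsearch versions expect `Content-Type: application/x-ndjson`, and older versions also wanted a `_type` in the action line, so adjust for whichever version is installed.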
It doesn't look like the provided document IDs (in the JSON header material) are being used by Elasticsearch. Elasticsearch is just using random identifiers (e.g., IQSkQ3btTI6L8KEvdnnlnA). That's going to make updates impossible: we'll just have an ever-growing index, full of duplicates.
I moved to using the bills' database IDs as the document IDs, and that fixed things. I'm not sure what was problematic about using e.g. 2019-hb123 as the document ID, but it wasn't working.
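To illustrate why an explicit ID matters: writing a document with a PUT to a URL that ends in its ID means a second write for the same bill replaces the first rather than creating a duplicate. The sketch below only builds the request and does not send it. The host, the index name `bills`, the ID `48213`, and the `_doc` path segment are assumptions (older Elasticsearch versions use a type name in place of `_doc`).

```python
import json
import urllib.parse
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build (but do not send) a PUT that stores `document` under `doc_id`."""
    # Percent-encode the ID so any unusual characters are safe in the URL path.
    safe_id = urllib.parse.quote(str(doc_id), safe="")
    url = f"{base_url.rstrip('/')}/{index}/_doc/{safe_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical database ID and document, for illustration only.
req = build_index_request(
    "http://localhost:9200", "bills", 48213, {"number": "hb123", "year": 2019}
)
```

Passing `req` to `urllib.request.urlopen` would perform the write. Sending it again with changed content should update the same document in place.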
I've got Elasticsearch configured on another server (to avoid taxing this one), with a rule opening up the firewall, but Elasticsearch is rejecting the connection. Figure out why, and ensure that the new firewall rule is loaded on reboot.
Server moved successfully.
Now that Elasticsearch is installed, index legislation with it, rather than Sphinx.
Done right, indexing legislation means exporting legislation as JSON. This should be as simple as generating a list of new legislation, making a request to the API for each one of those, and submitting that JSON to Elasticsearch.
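The three steps above (list new legislation, fetch each bill's JSON from the API, submit it to Elasticsearch) might be sketched roughly as follows. Both URL templates are placeholders, not the site's real endpoints, and the function names are hypothetical. The fetch and submit steps are passed in as parameters so the loop can be tried without touching either server.

```python
import json
import urllib.request

# Placeholder URL templates; the real API and Elasticsearch hosts will differ.
API_URL = "https://example.com/api/bill/{bill_id}.json"
ES_URL = "http://localhost:9200/bills/_doc/{bill_id}"


def fetch_bill(bill_id):
    """Request one bill's JSON from the (assumed) API endpoint."""
    with urllib.request.urlopen(API_URL.format(bill_id=bill_id)) as resp:
        return json.load(resp)


def submit_bill(bill_id, bill):
    """PUT one bill into Elasticsearch, using the bill ID as the document ID."""
    req = urllib.request.Request(
        ES_URL.format(bill_id=bill_id),
        data=json.dumps(bill).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def index_new_legislation(new_ids, fetch=fetch_bill, submit=submit_bill):
    """Fetch and submit each new bill; return the IDs that were processed."""
    indexed = []
    for bill_id in new_ids:
        bill = fetch(bill_id)
        submit(bill_id, bill)
        indexed.append(bill_id)
    return indexed
```

For a large backlog, the per-bill PUT could be swapped for a single bulk request. The one-request-per-bill form is shown here because it matches the steps as described.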