pelias / api

HTTP API for Pelias Geocoder
http://pelias.io
MIT License
218 stars 162 forks source link

Load type mapping before starting webserver #1625

Closed orangejulius closed 2 years ago

orangejulius commented 2 years ago

This is an extension to #1624 that ensures the type mapping is fully loaded from Elasticsearch before the Express webserver is started.

In my testing of #1624 I noticed that it's possible for a few requests to sneak by before the type mapping has loaded (either with success or failure).

This means in the case where valid type mapping data can't be loaded from Elasticsearch (because there's no data or no Elasticsearch cluster correctly configured at all), the API can successfully start up, attempt to serve requests (though they would probably result in errors), and then quit.

This does not play well with health checks like those that come with Kubernetes, ELBs, etc. Instead what we want is for the webserver to not come up at all until it's proven that the type mapping has loaded.

The change to do this wasn't too complicated but it ends up messy: a ton of code depends on the type mapping having been loaded beforehand, including tests. While having the type mapping .load() call happen before the Express webserver is started is desired, I'd love to find a way to avoid having to do something similar before running (or even require-ing) the unit test code. @missinglink if you have any ideas let me know.

missinglink commented 2 years ago

Hmm yeah I feel like the unit tests requiring a wrapper is an indication of the control flow being amiss.

Could we just add an EventEmitter to the type mapping which emits a load event?

We could then subscribe to events with a .once() function which wraps server.bind() but not the unit tests?

missinglink commented 2 years ago

Or even simpler the health check endpoint has a bool which is false and becomes true upon receiving the event, then we return a 400 for false and a 200 for true

missinglink commented 2 years ago

Something like this?

diff --git a/controller/status.js b/controller/status.js
index 7c87af6d..4a6aeac8 100644
--- a/controller/status.js
+++ b/controller/status.js
@@ -1,3 +1,10 @@
+const type_mapping = require('../helper/type_mapping');
+
 module.exports = function controller( req, res, next) {
+
+  if (!type_mapping.loaded) {
+    return res.status(500).send('status: loading');
+  }
+
   res.send('status: ok');
 };
diff --git a/helper/TypeMapping.js b/helper/TypeMapping.js
index 951dd6ba..30559bd6 100644
--- a/helper/TypeMapping.js
+++ b/helper/TypeMapping.js
@@ -42,6 +42,11 @@ var TypeMapping = function(){
    * and returns either that layer, or all the layers in the alias
    */
   this.layer_mapping = {};
+
+  /*
+   * Boolean to indicate if the mapping has completed loading
+  */
+  this.loaded = false;
 };

 TypeMapping.addStandardTargetsToAliases = function(standard, aliases) {
@@ -125,7 +130,12 @@ TypeMapping.prototype.loadTargets = function( targetsBlock ){
 };

 // load values from either pelias config file or from elasticsearch
-TypeMapping.prototype.load = function( done ){
+TypeMapping.prototype.load = function( cb ){
+
+  var done = () => {
+    this.loaded = true;
+    cb();
+  }

   // load pelias config
   const peliasConfigTargets = _.get(
missinglink commented 2 years ago

I guess the issue with this approach would be that some scripts may assume that once the server is bound that it is available.

So we'd either need to check both that the port is bound AND the healthcheck is successful before sending requests.

We can certainly add that to the pelias binary but would it be inflexible in other environments? 🤔

FWIW that's how ES works, it can be up and also in red status.

orangejulius commented 2 years ago

I like the EventEmitter idea. It's a bit of a shame since the TypeMapping class already has a callback parameter all set up, but it does make things complicated.

Using a bool like your code sample would probably work in most cases (it would be just fine for Kubernetes health checks which do look at the HTTP code). But you're right that it's not a 100% clean solution either.

EventEmitters aren't too hard to work with, so lets try that out!

orangejulius commented 2 years ago

Ah, I realized the EventEmitter won't buy us anything. The issue being worked around is that the TypeMapping.load call must complete before any of the unit tests are even require-d. I think instead we can and should fix that issue.

The issue I saw in the tests come down to the list of canonical_layers being empty at startup. So perhaps we can move the code that initializes that from the TypeMapping.load method to the constructor. That way, lines of code like this one will continue to work properly.

missinglink commented 2 years ago

Agh I didn't realize what the issue was with the tests, but yeah the initial load from the config can be sync and then the http req can be async.

orangejulius commented 2 years ago

Okay, that was relatively simple. The commit in this PR now splits the TypeMapping.load() function into two parts. The defaults are loaded from pelias-config as the type_mapping singleton file is require-d. Changes from Elasticsearch are optionally loaded later, and the callback for that async method is used in index.js to wait on starting the Express webserver.

This should be pretty foolproof and ensure no HTTP requests are served (and indeed the port isn't even open) until the type mapping is loaded and we are sure there's a valid Elasticsearch cluster configured.

orangejulius commented 2 years ago

This work has been brought into #1624 for final release