Old route info gets overwritten on every POST to /routing-table

sudomesh / monitor

a way to monitor health of (people's open) network

GNU General Public License v3.0

Old route info gets overwritten on every POST to /routing-table #11

Closed bennlich closed 6 years ago

bennlich commented 6 years ago

So if a home node goes offline, it disappears from the monitor on the next update (as opposed to lingering around with an old timestamp).

Need to change the way nodes are stored so they persist in the DB for some amount of time.

jhpoelen commented 6 years ago

As far as I know, memcached does not support querying keys, but I believe Redis does. It also looks like Heroku has a Redis add-on (https://elements.heroku.com/addons/heroku-redis). Redis supports TTL too, a feature that we currently rely on to detect an inactive gateway. One idea would be to switch from memcached to Redis and store individual route info under prefixed keys with no expiry or a long one. Alternatively, one could merge the old and new values under the existing nodes memcached key . . . but that sounds kinda messy.
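For reference, the "merge old and new" option could look roughly like the sketch below. It is only an illustration: the `dest` and `gateway` field names, the `|` key separator, and the one-hour `max_age_s` default are all assumptions, not what the monitor actually stores. Each route keeps the time it was last reported, so an offline node lingers with an old timestamp until it ages out.

```python
def merge_routes(stored, incoming, now, max_age_s=3600):
    """Merge a freshly posted routing table into the previously stored one.

    stored:   dict mapping a route id to {"route": dict, "last_seen": float}
    incoming: iterable of route dicts with hypothetical "dest"/"gateway" fields
    now:      current time in seconds (passed in so the function is testable)
    """
    merged = dict(stored)
    for route in incoming:
        route_id = f"{route['dest']}|{route['gateway']}"
        # Routes in the new POST get a fresh timestamp; others keep their old one.
        merged[route_id] = {"route": route, "last_seen": now}
    # Drop routes that nobody has reported for longer than max_age_s.
    return {
        route_id: entry
        for route_id, entry in merged.items()
        if now - entry["last_seen"] <= max_age_s
    }
```

The merged dict would then be written back under the single nodes key, which is a read-modify-write and so not safe against two concurrent POSTs without extra care.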

Curious to hear thoughts on this.

eenblam commented 6 years ago

Would it make more sense to just switch to a SQL datastore? It seems like we're trying to store (and more importantly query) more complex data, and if we're going to refactor for SQL at some point we might as well do it early.

jhpoelen commented 6 years ago

Seems a bit heavy-handed to me: creating/maintaining schemas etc. Besides, Redis is also a persistent data store with queries. And as far as complexity goes, I'd say only introduce it when we need to, in order to avoid premature optimization. I am sure that others have some opinions about this too. . .

eenblam commented 6 years ago

We do have to pay for persistence with heroku-redis.

This isn't a blocker, and we'll inevitably need to start spending a bit more on our infrastructure anyway.

jhpoelen commented 6 years ago

Nice catch! I assumed that the free tier had some persistence. Hmm. I'd prefer to avoid maintenance (e.g., installing / updating / migrating schemas) and get something that is pretty easy to test and deploy. Perhaps just run monitor/redis on the existing droplet instead?

bennlich commented 6 years ago

Ya I'm all for avoiding schemas, especially for stuff that's probably gonna change shape a bunch.

But I think we can keep using memcache, no? Why do we need to do any querying? Can key on (destIP, gatewayIP).
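A rough sketch of what keying on (destIP, gatewayIP) might look like. `TTLCache` is an in-memory stand-in for a memcached-style store, not a real client, and the `route-` prefix, the `dest`/`gateway` field names, and the TTL value are assumptions for illustration only.

```python
class TTLCache:
    """In-memory stand-in for a memcached-style store: set/get with expiry."""

    def __init__(self, clock):
        self._clock = clock  # callable returning the current time in seconds
        self._data = {}

    def set(self, key, value, ttl_s):
        self._data[key] = (value, self._clock() + ttl_s)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like missing ones
            return None
        return value


def route_key(dest_ip, gateway_ip):
    # One cache entry per (destination, gateway) pair.
    return f"route-{dest_ip}-{gateway_ip}"


def store_routes(cache, routes, ttl_s):
    # Each POST refreshes only the routes it mentions; entries for routes
    # missing from this POST are left alone until their own TTL runs out.
    for route in routes:
        cache.set(route_key(route["dest"], route["gateway"]), route, ttl_s)
```

With this layout a later POST no longer wipes earlier routes, but reading everything back still requires knowing which keys to ask for.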

eenblam commented 6 years ago

My thought is that we might also want more persistence in general in order to look at our data longitudinally, instead of just seeing the current state of things. It would be cool to see network health change over time, and it would be helpful to have better data on what failed, when, and what else was happening at that time. SQL seems like the right fit there, and it would solve this problem, but you're right that we can solve the problem without.

I'm not attached to Memcache or Redis in particular, so I'm fine with this getting wrapped up with either.

eenblam commented 6 years ago

> But I think we can keep using memcache, no? Why do we need to do any querying? Can key on (destIP, gatewayIP).

The solution to the above problem is, "Create a separate entry for each log instead of caching the whole table, and update individual records selectively instead of replacing the whole table," correct?

But, in addition to that, I'm thinking in terms of "monitoring over time" and providing an interface to data about the mesh.

Maybe it makes more sense for this particular service to act as an uptime monitor only, and separate those concerns?

bennlich commented 6 years ago

Extending this thing to monitor over time makes sense to me.

eenblam commented 6 years ago

@bennlich but you're right that for this particular case we can just alter the key to be more selective

eenblam commented 6 years ago

@bennlich After recent changes:

When receiving a routing table update from an exit node with IP X, we add the whole table under the key node-${X}. When retrieving the table, we just iterate over all exit node IPs and retrieve node-${IP} for each. This is easy for now, because the exit node IPs are currently available to the application and, for the time being, fixed.
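In sketch form, the scheme described above is roughly the following. This is not the actual monitor code: a plain dict stands in for the cache, and the exit node addresses are made-up documentation-range IPs.

```python
# Hypothetical fixed list of exit node IPs known to the application.
EXIT_NODE_IPS = ["192.0.2.1", "192.0.2.2"]


def node_key(exit_ip):
    return f"node-{exit_ip}"


def save_table(store, exit_ip, table):
    # A POST from one exit node replaces only that exit node's table.
    store[node_key(exit_ip)] = table


def load_all_tables(store, exit_ips):
    # No key querying needed: the set of possible keys is known up front.
    return {
        ip: store[node_key(ip)]
        for ip in exit_ips
        if node_key(ip) in store
    }
```

The retrieval side only works because `EXIT_NODE_IPS` is fixed; once the key includes something open-ended like a route, there is no fixed list to iterate over.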

It would be easy to store individual rows as node-${exitIP}-${route}, but what would retrieval look like?

We could still switch to heroku-redis to get querying without making the jump to paying for a higher tier to get persistence, or we could switch to Mongo or whatever. I'm excited about whatever keeps things simplest for the time being.
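One possible answer to the retrieval question, if we stay on a store that cannot list keys, would be to keep a small index entry per exit node that records which routes have rows. The sketch below is only that, a sketch: a plain dict stands in for the cache, and the `node-${exitIP}-index` key name is made up here.

```python
def row_key(exit_ip, route):
    return f"node-{exit_ip}-{route}"


def index_key(exit_ip):
    # Hypothetical extra entry listing the routes stored for this exit node.
    return f"node-{exit_ip}-index"


def save_rows(store, exit_ip, rows):
    """rows: dict mapping a route string to that route's row data."""
    known = set(store.get(index_key(exit_ip), []))
    for route, row in rows.items():
        store[row_key(exit_ip, route)] = row
        known.add(route)
    # Read-modify-write of the index: not atomic in a real shared cache.
    store[index_key(exit_ip)] = sorted(known)


def load_rows(store, exit_ip):
    rows = {}
    for route in store.get(index_key(exit_ip), []):
        row = store.get(row_key(exit_ip, route))
        if row is not None:  # in a real cache the row may have expired
            rows[route] = row
    return rows
```

Retrieval is then one read for the index plus one per route. The index can drift from the rows if entries expire separately, which is part of why a store with real key queries might end up simpler.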