optimizely / express-middleware

FullStack & Rollouts Express Middleware
Apache License 2.0

An idea on deploying to multiple hosts #6

Open stevenpetryk opened 4 years ago

stevenpetryk commented 4 years ago

Hey again! Just following up on this really nice callout in the README:

Note: If you deploy your server to multiple different machines, this will not ensure that the two machines are in-sync with the latest configuration. If you would like to see support for cross-machine syncing via webhooks, please let us know by opening an issue on this repository.

This is us: we have a few EC2 hosts, with memcached and S3 as potential nearby datastores for the datafile. We would probably switch to webhooks + memcached soon if this library supported that setup, but I'm not sure it does. I've thought of a few ways the library could be altered to support it.

Please don't view this as a request, it's just an idea meant to help shape your roadmap with customer issues :)

Allowing people to use a custom cache adapter

Right now, the datafile is cached in-memory. This library maintains its own copy of the datafile, and the Optimizely SDK instances it creates have their own datafile copies too (though, I think on most requests, req.optimizely.datafile will be the middleware's copy).

We can't use memcached with this setup because we have no way of overriding how the library stores and retrieves the datafile. If we could instead specify our own cache, we could write a small wrapper that writes through to memcached (or any distributed cache, even S3) but reads primarily from memory.

const optimizelyExpress = require('@optimizely/express');
const optimizely = optimizelyExpress.initialize({
  sdkKey: '...',
  datafileOptions: {
    autoUpdate: true,
    updateInterval: 600000 // 10 minutes in milliseconds
  },
  // Proposed option; not supported by the library today.
  datafileCache: {
    get () { /* read from memory, update memory from memcached */ },
    set (newDatafile) { /* write to memcached, update memory */ }
  }
});

get would be called by the library whenever it provides the datafile to requests or passes it to the main SDK. set would be called in response to webhooks, polling updates, or any other update.
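To make the idea concrete, here is a minimal sketch of what such a write-through wrapper might look like. Everything in it is an assumption for illustration: `createWriteThroughCache`, `createFakeRemote`, the `refresh` method, and the `optimizely-datafile` key are hypothetical names, and `datafileCache` itself is only the option proposed above, not something the library accepts today. The remote store is an in-memory stand-in so the sketch runs without a memcached server.

```javascript
// Hypothetical adapter matching the proposed `datafileCache` shape.
// `remote` is any store with async get(key) / set(key, value), for example
// a thin wrapper around a memcached client. None of this is part of
// @optimizely/express today.
function createWriteThroughCache(remote, key = 'optimizely-datafile') {
  let local = null; // in-memory copy served to requests

  return {
    // Synchronous read from memory, so request handling never waits on the network.
    get() {
      return local;
    },

    // Update memory first, then write through to the shared store.
    async set(newDatafile) {
      local = newDatafile;
      await remote.set(key, JSON.stringify(newDatafile));
    },

    // Pull the shared copy into memory; call this on a timer or from a webhook handler.
    async refresh() {
      const raw = await remote.get(key);
      if (raw != null) {
        local = JSON.parse(raw);
      }
      return local;
    }
  };
}

// In-memory stand-in for memcached (assumption: a real client would be wrapped
// to expose the same async get/set pair).
function createFakeRemote() {
  const store = new Map();
  return {
    async get(k) { return store.get(k); },
    async set(k, v) { store.set(k, v); }
  };
}
```

With this shape, the host that receives a webhook or polling update calls set, and the other hosts pick up the new datafile the next time they call refresh. Reads stay in memory either way.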

asaschachar commented 4 years ago

@stevenpetryk Thanks for the suggestion! It will certainly help shape our roadmap.

Can I ask why you would switch to webhooks + memcached? Is it because you want datafile updates to be as close to real time as possible? Or do you want to reduce the amount of networking from the SDKs themselves?

If you'd like to reduce networking from the SDKs, then our latest recommended architecture is to use Optimizely Agent (https://docs.developers.optimizely.com/full-stack/docs/optimizely-agent), a containerized Optimizely as-a-service.

Once you deploy Optimizely Agent in your infrastructure, you can have it make polling requests for the datafile, receive webhooks for the datafile, and store the latest datafile. All datafile traffic to Optimizely's CDN is then centralized in the Agent.

Then you can use the urlTemplate parameter of the SDK so that your SDKs will use Optimizely Agent as the source of truth for your configuration across hosts rather than fetching the datafile from the CDN on their own.
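As a rough sketch of that last step: in the JavaScript SDK, urlTemplate is a datafileOptions setting in which %s is replaced by the SDK key. The hostname and path below are placeholders chosen for illustration, not a documented Agent route, and `expandUrlTemplate` is a hypothetical helper that only shows how the substitution works. Whether the middleware forwards this option to the SDK is also an assumption here.

```javascript
// Assumption: the datafile is served from an internal host. The hostname and
// path are placeholders, not a documented Optimizely Agent route.
const datafileOptions = {
  autoUpdate: true,
  updateInterval: 600000, // 10 minutes in milliseconds
  urlTemplate: 'http://optimizely-agent.internal:8080/datafiles/%s.json'
};

// Illustration only: expand a template by substituting the SDK key for %s.
function expandUrlTemplate(template, sdkKey) {
  return template.replace(/%s/g, sdkKey);
}

console.log(expandUrlTemplate(datafileOptions.urlTemplate, 'my-sdk-key'));
// prints "http://optimizely-agent.internal:8080/datafiles/my-sdk-key.json"
```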