habitat-sh / on-prem-builder

Scripts to stand up an on-premise Habitat Builder Depot
Apache License 2.0

Configurable Memcached service #253

Open sajjaphani opened 2 years ago

sajjaphani commented 2 years ago

Currently, the on-prem setup will start a new Memcached service for each of the builder-api services. This creates a tight coupling between these two services and also poses difficulties in multi-node instances.

The data stored in Memcached is primarily:

  1. User Session data
  2. Package metadata

Caching this data may not matter much for an on-prem setup, assuming:

  1. the traffic is relatively low
  2. most of the traffic flows within a local network (intranet?)

From the above observations, we can propose the following approaches:

  1. Make Memcached optional
  2. Provide an option to supply the Memcached config

Make Memcached optional

It may not look ideal, but for small loads, it will be a better option than having to set up one or more Memcached instances.
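To make the "optional" idea concrete, here is a minimal sketch in Python (not the language Builder itself is written in). It assumes a hypothetical `OptionalCache` wrapper: when no backend is configured, every lookup falls through to the loader, which would be the database in practice. `InMemoryBackend` is only a stand-in for a real Memcached client, and none of these names come from the on-prem-builder code.

```python
from typing import Callable, Dict, Optional, Protocol


class CacheBackend(Protocol):
    """Minimal interface a cache client would need (hypothetical)."""

    def get(self, key: str) -> Optional[bytes]: ...
    def set(self, key: str, value: bytes) -> None: ...


class InMemoryBackend:
    """Stand-in for a real Memcached client; used here only for illustration."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


class OptionalCache:
    """Cache wrapper that degrades to a pass-through when no backend is set."""

    def __init__(self, backend: Optional[CacheBackend] = None) -> None:
        self._backend = backend

    def get_or_load(self, key: str, loader: Callable[[], bytes]) -> bytes:
        # No cache configured: always go to the source of truth.
        if self._backend is None:
            return loader()
        cached = self._backend.get(key)
        if cached is not None:
            return cached
        # Cache miss: load once, then store for later requests.
        value = loader()
        self._backend.set(key, value)
        return value
```

With `OptionalCache()` every call hits the loader; with `OptionalCache(InMemoryBackend())` only the first call for a key does. The calling code stays the same either way, which is the point of making the cache optional.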

If we make Memcached optional, there will not be a significant impact on the package metadata. However, a problem arises with the user session data. We can handle session management in one of the following ways:

  1. Persist session data in the database, i.e. configure the servers to use persistent sessions. If sessions are persisted directly in the database, then all the API services will see the same session data. The major drawback is that it requires DB access on each request.
  2. Avoid server-side sessions by using a signed cookie to identify users, a JSON Web Token for example. This can ensure scalability and failover, but it also poses its own set of problems.
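The signed-cookie option can be sketched with nothing but the Python standard library. This is an illustration of the general technique (an HMAC-signed payload with an expiry), not a standards-compliant JWT and not how Builder handles sessions today. The function names `issue_token` and `verify_token` and the claim names are assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(user_id: str, secret: bytes, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>', both URL-safe base64 encoded."""
    issued = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": issued + ttl_seconds}).encode("utf-8")
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(sig)


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired, else None."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:
        # Wrong number of parts or invalid base64.
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison; the payload is only parsed after this check.
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        return None
    return claims["sub"]
```

Any API node that holds the shared secret can verify a token without touching Memcached or the database. One of the problems this approach brings is revocation: a token stays valid until it expires unless the servers keep some state about revoked tokens.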

Provide an option to supply the Memcached config

There is no one-size-fits-all solution. Each of the solutions proposed above has tradeoffs. We can gather feedback on these proposals (or new ones) before choosing one or more of them. In my opinion, spinning up a single Memcached instance and pointing all the API instances at it might be a reasonable choice.
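As a rough sketch of what "supplying the Memcached config" could look like, the Python snippet below reads a comma-separated host list from an environment variable. The variable name `MEMCACHED_HOSTS` is hypothetical and is not an existing on-prem-builder setting; 11211 is Memcached's default port. An empty or missing value means "no Memcached", so the same setting could cover both proposals.

```python
import os
from typing import List, Mapping, Optional, Tuple

DEFAULT_MEMCACHED_PORT = 11211  # Memcached's default port


def parse_memcached_hosts(raw: Optional[str]) -> List[Tuple[str, int]]:
    """Parse 'host1:port1,host2' into [(host, port), ...].

    An empty or missing value yields an empty list, i.e. caching disabled.
    IPv6 literals are not handled in this sketch.
    """
    if raw is None or not raw.strip():
        return []
    hosts: List[Tuple[str, int]] = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep:
            hosts.append((host, int(port)))
        else:
            # No port given: fall back to the default.
            hosts.append((entry, DEFAULT_MEMCACHED_PORT))
    return hosts


def memcached_hosts_from_env(env: Mapping[str, str] = os.environ) -> List[Tuple[str, int]]:
    # MEMCACHED_HOSTS is a hypothetical variable name used for illustration.
    return parse_memcached_hosts(env.get("MEMCACHED_HOSTS"))
```

For the single shared instance suggested above, every API node would be given the same value, for example `MEMCACHED_HOSTS=10.0.0.5:11211` (the address is made up).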

rahulgoel1 commented 2 years ago

Thanks for sharing your thoughts. I also feel that we need to revisit the use of Memcached as a caching solution. Do we even need distributed caching, or can we manage the session in the DB and still meet the performance needs of our app?