ystia / yorc

Ystia Orchestrator
https://ystia.github.io
Apache License 2.0

Storing everything in consul is not sustainable #360

Closed fcdenos closed 4 years ago

fcdenos commented 5 years ago

Is your feature request related to a problem? Please describe.

At SG we are going to be a fairly intensive user of A4C/Yorc for our deployments. Unfortunately, at an (early) point in our migrations we ran into out-of-memory problems. We found that, at least for our usage, storing everything (topology elements, types, logs, ...) in Consul is just not sustainable. We have had, and are still having, lots of performance problems, a good part of which comes from this.

The main pain points are :

Describe the solution you'd like

I think an overhaul of the product architecture, especially the way data is stored, is needed if it is to be used in intensive environments. The most obvious suggestion to me (even if others exist) seems to be:

Describe alternatives you've considered

Additional context

loicalbertin commented 5 years ago

Hi @fcdenos,

Thanks for your feedback. Could you please elaborate by sharing some metrics: how many deployments do you have before encountering memory issues on Consul, and how many keys are present under _yorc/deployments, _yorc/events and _yorc/logs? If possible, could you also give us an idea (at least an order of magnitude) of the size of each of those paths?
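For illustration, the key counts asked for above could be gathered with a small Python sketch along these lines. It assumes a Consul agent reachable at http://127.0.0.1:8500 and uses the `?keys` listing of the Consul KV HTTP API; the helper names `fetch_keys` and `count_by_prefix` are hypothetical and not part of Yorc.

```python
import json
import urllib.parse
import urllib.request

# Prefixes named in the question above.
PREFIXES = ("_yorc/deployments", "_yorc/events", "_yorc/logs")


def fetch_keys(prefix, base_url="http://127.0.0.1:8500"):
    """Return the key names under `prefix` (names only, no values).

    Assumes a local Consul agent; Consul answers 404 when no key matches,
    which urllib surfaces as an HTTPError.
    """
    url = f"{base_url}/v1/kv/{urllib.parse.quote(prefix)}?keys"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def count_by_prefix(keys, prefixes=PREFIXES):
    """Count how many key names fall under each prefix."""
    counts = {p: 0 for p in prefixes}
    for key in keys:
        for p in prefixes:
            if key == p or key.startswith(p + "/"):
                counts[p] += 1
                break
    return counts
```

Calling `count_by_prefix(fetch_keys("_yorc"))` against a running agent would give one count per path; the counting step itself needs no network access.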

We have been thinking for a while about making log storage, and maybe even event storage, pluggable so that it can integrate with other systems.

Regarding the way we store deployment definitions, it may indeed involve many keys, but their individual size should be relatively low.

mguillon commented 5 years ago

Hello, Today we have 312 deployments:

On filesystem side:

loicalbertin commented 5 years ago

Hi @fcdenos & @mguillon

I'm currently investigating this issue and how to deal with it. First, here is how I estimate the size of a Consul subpath (it's approximate):

curl -s "http://127.0.0.1:8500/v1/kv/_yorc/deployments?recurse" | jq '[.[] | (.Value + "" | @base64d ) + .Session + .Key | utf8bytelength + 32] | add '

Note: jq 1.6+ is needed

A single entry looks like this:

    {
        "LockIndex": 0,
        "Key": "_yorc/deployments/Test-Environment/workflows/uninstall/steps/FIPCompute_2_uninstall/target",
        "Flags": 0,
        "Value": "RklQQ29tcHV0ZV8y",
        "Session": "",
        "CreateIndex": 10826,
        "ModifyIndex": 10826
    }

LockIndex, CreateIndex, ModifyIndex and Flags are uint64, so they consume 8 bytes each, for a total of 32 bytes per key. Key, Value and Session are strings, so their size depends on the length of their content. Note that Value appears here in base64 but is stored internally in plain text within Consul's in-memory database.

To estimate the size of each key, the idea is to concatenate the string values, compute their size in bytes, and finally add 32 bytes for the 4 uint64 fields.
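The same estimate can be sketched in Python, which may be easier to adapt than the jq one-liner. This is only an approximation under the assumptions stated above (4 uint64 fields, strings counted by byte length); the function names are hypothetical. Unlike the jq version, it counts the raw decoded bytes of Value, which gives the same result for valid UTF-8 values.

```python
import base64

# LockIndex, CreateIndex, ModifyIndex and Flags: 4 uint64 fields of 8 bytes.
FIXED_OVERHEAD = 4 * 8


def estimate_entry_size(entry):
    """Approximate the in-memory size, in bytes, of one Consul KV entry."""
    # Value is base64-encoded in the HTTP response; it can be null or absent.
    value = base64.b64decode(entry.get("Value") or "")
    key = (entry.get("Key") or "").encode("utf-8")
    session = (entry.get("Session") or "").encode("utf-8")
    return len(key) + len(value) + len(session) + FIXED_OVERHEAD


def estimate_total_size(entries):
    """Sum the estimate over a list of entries, as returned by `?recurse`."""
    return sum(estimate_entry_size(e) for e in entries)


# The sample entry shown above.
SAMPLE = {
    "LockIndex": 0,
    "Key": "_yorc/deployments/Test-Environment/workflows/uninstall/steps/FIPCompute_2_uninstall/target",
    "Flags": 0,
    "Value": "RklQQ29tcHV0ZV8y",
    "Session": "",
    "CreateIndex": 10826,
    "ModifyIndex": 10826,
}

if __name__ == "__main__":
    print(estimate_entry_size(SAMPLE))
```

For the sample entry, the result is the byte length of the key, plus the 12 bytes of the decoded value ("FIPCompute_2"), plus the 32 bytes of fixed overhead.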

That said, I identified several ways to reduce the size of the _yorc/deployments subpath:

  1. We store parts of the TOSCA definition in Consul that we do not use. I made a simple test on a very basic application by no longer storing the description of TOSCA types, and it reduced the computed size by 22%.
  2. Sometimes we store keys with an empty value. This may be improved by treating a key that is missing in Consul as empty. This should be checked, as it may have some side effects.
  3. There are (a lot of) duplicated types. Builtin TOSCA types are stored under every deployment subpath, and types are the most storage-consuming part of a deployment. We could store builtin types only once.
  4. To limit the number of keys, we may pack several keys into a single one, for instance as a JSON-marshaled value. We would then have to unmarshal it whenever we need a part of it. This could be interesting especially for immutable keys like TOSCA type definitions.
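The last idea, packing several keys into one JSON-marshaled value, can be illustrated with a minimal Python sketch. The key layout and attribute names below are hypothetical and do not reflect Yorc's actual schema; sizes use the same rough estimate as above (string bytes plus 32 bytes of fixed overhead per key).

```python
import json

PER_KEY_OVERHEAD = 32  # four uint64 metadata fields per Consul entry


def flat_size(prefix, attrs):
    """Estimated size when each attribute is stored as its own key."""
    return sum(
        len(f"{prefix}/{name}".encode("utf-8"))
        + len(value.encode("utf-8"))
        + PER_KEY_OVERHEAD
        for name, value in attrs.items()
    )


def pack(attrs):
    """Marshal all attributes into one compact JSON value."""
    return json.dumps(attrs, separators=(",", ":"), sort_keys=True).encode("utf-8")


def unpack(blob):
    """Unmarshal the whole value; needed even to read a single attribute."""
    return json.loads(blob)


def packed_size(prefix, attrs):
    """Estimated size when all attributes live under a single key."""
    return len(prefix.encode("utf-8")) + len(pack(attrs)) + PER_KEY_OVERHEAD


# Hypothetical type definition stored under a hypothetical prefix.
PREFIX = "_yorc/deployments/app/topology/types/tosca.nodes.Compute"
ATTRS = {
    "derived_from": "tosca.nodes.Root",
    "version": "1.0.0",
    "description": "",
}

if __name__ == "__main__":
    print(flat_size(PREFIX, ATTRS), packed_size(PREFIX, ATTRS))
```

In this sketch the saving comes from paying the key prefix and the 32-byte overhead once instead of once per attribute. The cost is that every read has to unmarshal the whole value, which is why the approach suits immutable data best.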

stebenoist commented 5 years ago

+1. This approach to reduce/organize the way we store data in Consul KV sounds good to me.

loicalbertin commented 5 years ago

Hi @fcdenos & @mguillon,

A first refactoring has been merged into develop and will be included in 3.2.0-M5. It removes data stored in Consul that is not used. A Consul schema migration tool is also integrated and will update your running database to remove useless keys.

A major refactoring that will remove Yorc's builtin types duplicated across deployments (#371) is planned for our next sprint.

loicalbertin commented 4 years ago

I consider this issue closed now that alternative store implementations allow data to be stored elsewhere.