zeebe-io / benchmark-helm

Contains a helm chart to execute zeebe benchmarks
https://zeebe-io.github.io/benchmark-helm/
Apache License 2.0

Helm chart still uses curator #126

Closed: Zelldon closed this issue 10 months ago

Zelldon commented 11 months ago

After or during #125, we should make sure to migrate to ILM and configure it correctly.

Important to note: we need to check how we handle old runs for old releases.
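
For reference, a minimal sketch of what the ILM-based setup could look like on the exporter side, assuming a Zeebe version whose Elasticsearch exporter supports the retention settings. The exact keys, defaults, policy name, and the service URL shown here are placeholders and should be checked against the exporter documentation:

```yaml
# Sketch only: Zeebe Elasticsearch exporter args with exporter-managed ILM
# retention instead of a curator cron job. Keys assume a recent exporter
# version; verify against the exporter documentation.
zeebe:
  broker:
    exporters:
      elasticsearch:
        className: io.camunda.zeebe.exporter.ElasticsearchExporter
        args:
          url: http://elasticsearch:9200      # placeholder service URL
          index:
            prefix: zeebe-record              # default record index prefix
          retention:
            enabled: true                     # exporter creates/attaches an ILM policy
            minimumAge: 10m                   # delete-phase min_age for record indices
            policyName: zeebe-record-retention-policy  # assumed name, check the default
```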

Zelldon commented 10 months ago

Copied from Slack:

:til: ILM min_age starts from the last update of your index, or at least this is what I observed.

I am updating our benchmark Helm charts to the most recent version, including an upgrade to ES 8 and ILM. Unfortunately, we generate a lot of data, which is why we previously used Curator quite heavily to delete data every 15 minutes, when an index reaches a certain size, etc.

It looks like with ILM you can set min_age (in our config) and max_size (not part of our config). I have set ILM min_age to 10m, and had hoped that every 10 minutes the index would be deleted.

This is not what happens. The index is only deleted once it is no longer updated, so either when stopping the load (which I did for testing purposes) or when we create a new index, which we do every new day. This also means that when we create a new index (because of a new day), the min_age only then starts ticking. For example, with one day configured as min_age, another full day needs to pass before the index is deleted. This was a bit of a surprise to me (I think this behavior changed compared to the previous behavior).

I thought this might be interesting for you as well. If someone is an ILM expert, I would be happy to pair a bit on how we could get our benchmarks running again.
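
For context, the delete-only setup described above boils down to roughly the following policy. It is shown as annotated YAML; the actual request body for `PUT _ilm/policy/<name>` is the JSON equivalent, and the policy name is just an example:

```yaml
# YAML rendering of the JSON body for: PUT _ilm/policy/zeebe-benchmark-cleanup
# (policy name is an example). With plain daily indices and no rollover, the
# delete phase was observed in this thread to kick in only once the index
# stops receiving writes.
policy:
  phases:
    delete:
      min_age: "10m"       # how long to wait before entering the delete phase
      actions:
        delete: {}         # remove the index entirely
```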

(screenshot attachment: 2024-01-15_13-21)

Zelldon commented 10 months ago

Copied from Slack:

Chris: I feel like I'm hitting a bit of a dead end regarding the ILM policies and our benchmarks.

I see the following options:

  1. :yolo: Don't care and just increase the ES disk size to something that will survive for a day (currently we just use 16 gig)
  2. :burn: Create a new cron job that runs every 15 minutes and deletes ALL indices (or maybe only indices bigger than X, but this would make it more complicated); see the sketch after this list
  3. :male-detective::skin-tone-3: Look into a different indexing strategy, like creating indices per hour instead of per day, OR per size (even better, but not sure how complicated)
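
A rough sketch of what option 2 could look like as a Kubernetes CronJob. All names, the image tag, and the Elasticsearch URL are placeholders, and wildcard deletes may additionally require `action.destructive_requires_name: false` on ES 8:

```yaml
# Sketch of option 2: delete all benchmark record indices every 15 minutes.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: zeebe-index-cleanup            # placeholder name
spec:
  schedule: "*/15 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: cleanup
              image: curlimages/curl:8.5.0   # placeholder image/tag
              args:
                - "-sS"
                - "-XDELETE"
                # placeholder URL; wildcard deletes need
                # action.destructive_requires_name=false on ES 8
                - "http://elasticsearch:9200/zeebe-record*"
```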

I'm currently thinking about option 3.2 (size-based indices): check how hard this is, time-box it, and if it doesn't work we might go with a bigger disk for now (since option 2 will definitely also cause issues with the Operate benchmarks).

I know that we have gotten this request (about size-based indices) before, so it might be worth looking into.

Any other suggestions or ideas?

Chris: Roman, I haven't looked into the ES exporter for a while, but afaik we don't use data streams, right? Isn't this actually something we could think about? https://www.elastic.co/guide/en/elasticsearch/reference/current/data-streams.html Looks like something useful for us? :thinking_face:

Roman: Chris, no, currently we don't use data streams. I thought about migrating the exporter to data streams in the past but haven't followed up on this so far.

Ole: Hourly indices would make a lot of sense to me :+1:
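
If the exporter were changed to write into a data stream, the matching index template would look roughly like the sketch below (annotated YAML for the JSON body of `PUT _index_template/<name>`; the template name, index pattern, and policy name are placeholders, and the exporter itself would need changes to append to a data stream):

```yaml
# YAML rendering of the JSON body for: PUT _index_template/zeebe-record-stream
# (template name and pattern are examples). The data_stream object declares
# that writes to matching names go into a data stream, whose backing indices
# ILM can roll over and delete by size or age.
index_patterns: ["zeebe-record-stream*"]
data_stream: {}
priority: 200
template:
  settings:
    index.lifecycle.name: zeebe-benchmark-cleanup   # ILM policy to apply
```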

I made it possible to configure the index suffix. The defaults stay the same, but we can now configure an hourly-based index as well, or even a weekly one if we don't expect much traffic. https://github.com/camunda/zeebe/pull/15953
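
For illustration, a hypothetical exporter args snippet with an hourly index suffix; the exact option name and the accepted patterns come from the linked PR and should be checked there:

```yaml
# Hypothetical sketch: hourly record indices instead of the daily default.
# The exact option name is defined in camunda/zeebe#15953; check the PR.
zeebe:
  broker:
    exporters:
      elasticsearch:
        args:
          index:
            prefix: zeebe-record
            # e.g. a suffix of 2024-01-15-13 instead of the daily 2024-01-15
            indexSuffixDatePattern: "yyyy-MM-dd-HH"
```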