salvatore-campagna opened this issue 1 month ago (status: Open)
@elastic/es-perf I see the existing elastic/logs track has a cross-clusters-search-and-snapshot challenge which does something similar to what I described above, but it restores a snapshot to remote clusters (for CCS). Would it be possible to reuse that challenge and restore the snapshot to the original cluster multiple times instead of restoring it to remote clusters?
Note that the new challenge needs to skip deleting the .fleet_globals-1 component template. Not skipping the delete operation results in an error later when trying to delete it, because this component template is still in use by some of the index templates installed by Elasticsearch.
The error is:

```
esrally.exceptions.RallyError: Cannot run task [delete-all-component-templates]: Request returned an error. Error type: api, Description: illegal_argument_exception ({'error': {'root_cause': [{'type': 'illegal_argument_exception', 'reason': 'component templates [.fleet_globals-1] cannot be removed as they are still in use by index templates [synthetics-browser.screenshot, synthetics-browser, synthetics-icmp, synthetics-http, synthetics-tcp, metrics-fleet_server.agent_status, metrics-fleet_server.agent_versions, synthetics-browser.network, logs-fleet_server.output_health]'}], 'type': 'illegal_argument_exception', 'reason': 'component templates [.fleet_globals-1] cannot be removed as they are still in use by index templates [synthetics-browser.screenshot, synthetics-browser, synthetics-icmp, synthetics-http, synthetics-tcp, metrics-fleet_server.agent_status, metrics-fleet_server.agent_versions, synthetics-browser.network, logs-fleet_server.output_health]'}, 'status': 400}), HTTP Status: 400
```
See https://github.com/elastic/rally-tracks/commit/625168845e488718ba488e138d53781b677a621a
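To illustrate the skip, here is a minimal sketch (hypothetical helper names, not the actual rally-tracks code) of how a delete-all-component-templates step could filter out templates that must survive:

```python
# Hypothetical helper: filter the component templates a cleanup task is
# allowed to delete, skipping ones still referenced by index templates.
# SKIP_LIST and deletable_component_templates are illustrative names,
# not part of the actual rally-tracks implementation.

SKIP_LIST = {".fleet_globals-1"}

def deletable_component_templates(all_templates, skip=SKIP_LIST):
    """Return only the component templates that are safe to delete."""
    return [name for name in all_templates if name not in skip]

templates = ["logs-settings", ".fleet_globals-1", "metrics-mappings"]
print(deletable_component_templates(templates))
# -> ['logs-settings', 'metrics-mappings']
```

The actual cleanup task would then issue one delete request per surviving name instead of deleting everything unconditionally.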
We would like to run an experiment in Rally which uses a considerable amount of data. The idea is to fill the disk of an AWS instance with 7.5 TB of raw data. Indexing such a large amount of data poses at least two challenges, which are a result of the way the elastic/logs Rally track is designed:

1. @timestamp needs to change depending on how much data we need to index per day (raw_data_volume_per_day).
2. For our experiment, described in an internal Jira ticket, we use an is4gen.8xlarge instance, which has 4 x 7.5 TB = 30 TB of storage available. Note that if we assume a x10 raw-to-JSON expansion, we would need 75 TB of JSON data to end up with 7.5 TB of raw data. This means that even the instance with the largest storage can't handle the amount of data we need.

As a result, benchmarking this scenario is practically impossible, both because of resource constraints and because of the time data generation and indexing would require.
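The storage arithmetic above can be sanity-checked in a few lines (the x10 expansion factor is the assumption stated above; the real factor depends on the dataset):

```python
# Back-of-the-envelope check of the storage numbers discussed above.
RAW_TARGET_TB = 7.5            # raw data we want on disk
EXPANSION_FACTOR = 10          # assumed raw-to-JSON expansion
INSTANCE_STORAGE_TB = 4 * 7.5  # is4gen.8xlarge: 4 x 7.5 TB = 30 TB

json_needed_tb = RAW_TARGET_TB * EXPANSION_FACTOR
print(json_needed_tb)                        # 75.0 TB of JSON
print(json_needed_tb > INSTANCE_STORAGE_TB)  # True: does not fit on the instance
```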
So the idea is to adopt the following strategy, which we would like to implement in a new challenge as part of the elastic/logs track:

1. Index a smaller amount of data (raw_data_volume_per_day).
2. Take a snapshot of the resulting data.
3. Restore the snapshot into the same cluster multiple times.
4. Reuse the existing logging-querying challenge to collect query latencies.

For the use case above, where we need to fill the instance with 7.5 TB of raw data, this means restoring the snapshot 75 times. We expect results comparable to those collected with the elastic/logs track and the existing logging-querying challenge.

An experiment configured as described above mimics an environment where 75 hosts are logging exactly the same dataset.
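As a sketch of the repeated-restore step (not the actual track code), each restore of the same snapshot into the same cluster needs unique target index names, which the Elasticsearch snapshot restore API supports via rename_pattern / rename_replacement; the index pattern and suffix scheme below are illustrative assumptions:

```python
# Sketch: generate N restore request bodies for the same snapshot, each
# renaming the restored indices with a unique suffix so they can coexist
# in one cluster. Field names (indices, rename_pattern, rename_replacement)
# are real snapshot restore API options; "logs-*" and the "-copy-N" suffix
# are illustrative choices.

def restore_bodies(num_copies, index_pattern="logs-*"):
    for copy in range(num_copies):
        yield {
            "indices": index_pattern,
            "rename_pattern": "(.+)",
            "rename_replacement": f"$1-copy-{copy}",
        }

bodies = list(restore_bodies(75))
print(len(bodies))                      # 75
print(bodies[0]["rename_replacement"])  # $1-copy-0
```

Each body would be POSTed to the _snapshot/<repo>/<snapshot>/_restore endpoint in turn, waiting for one restore to complete before starting the next.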
Note that the snapshot API is only available in on-prem deployments, which means we need to run the benchmark on-prem.