elastic / rally-tracks

Track specifications for the Elasticsearch benchmarking tool Rally

Synthetic source does not yet support `copy_to` #652

Closed salvatore-campagna closed 1 month ago

salvatore-campagna commented 1 month ago

This causes the elastic/security track to fail when index_mode is set to logsdb. LogsDB uses synthetic source, which does not support copy_to. Support for copy_to is expected in Elasticsearch 8.16. In the meantime, we exclude the copy_to setting from the mapping to avoid triggering the error.

This needs backporting to 8.15.
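
For illustration, the workaround described above (dropping copy_to from the mapping only when the index mode is logsdb) could be sketched roughly as follows. This is a minimal sketch, not the change made in this PR: the helper names `strip_copy_to` and `mapping_for` and the example mapping are hypothetical, and the track may well do the same thing directly in its templates.

```python
def strip_copy_to(node: dict) -> dict:
    """Return a copy of a mapping (or field definition) without copy_to parameters."""
    # Drop the parameter at this level. A *field* that happens to be named
    # "copy_to" lives under "properties" and is therefore left alone.
    cleaned = {key: value for key, value in node.items() if key != "copy_to"}
    properties = cleaned.get("properties")
    if isinstance(properties, dict):
        # Recurse into object fields and sub-fields.
        cleaned["properties"] = {
            name: strip_copy_to(field) for name, field in properties.items()
        }
    return cleaned


def mapping_for(index_mode: str, mapping: dict) -> dict:
    """Strip copy_to only when the index runs in logsdb mode (assumed parameter value)."""
    return strip_copy_to(mapping) if index_mode == "logsdb" else mapping


# Illustrative mapping only; it is not copied from the track.
example_mapping = {
    "properties": {
        "message": {"type": "text"},
        "kubernetes": {
            "properties": {
                "event": {
                    "properties": {
                        "message": {"type": "text", "copy_to": "message"}
                    }
                }
            }
        },
    }
}

logsdb_mapping = mapping_for("logsdb", example_mapping)
```

The sketch only walks `properties`; copy_to entries inside dynamic templates would need separate handling.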

salvatore-campagna commented 1 month ago

Since copy_to is not used under LogsDB, I need to go through all queries to make sure they use kubernetes.event.message instead of the empty message field. Otherwise queries might fail or, worse, run much faster only because the message field is empty.

salvatore-campagna commented 1 month ago

I tried adding a workflow-folder parameter that we can use to override the queries executed for logsdb. We would just need to pass an additional track parameter, workflow-folder, set to workflows-logsdb when running with logsdb as the index.mode.
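
As a rough sketch of the idea, resolving the folder from the track parameters could look like the following. The default folder name `workflows`, the helper names, and the warning text are assumptions made for illustration; only the `workflow-folder` parameter and the `workflows-logsdb` value come from the proposal above.

```python
from typing import Optional

# Assumed default folder name; the real track may use a different one.
DEFAULT_WORKFLOW_FOLDER = "workflows"


def resolve_workflow_folder(track_params: dict) -> str:
    """Pick the folder whose queries should be executed."""
    return track_params.get("workflow-folder", DEFAULT_WORKFLOW_FOLDER)


def check_params(track_params: dict) -> Optional[str]:
    """Return a warning when logsdb runs with the default queries, else None."""
    uses_logsdb = track_params.get("index_mode") == "logsdb"
    uses_default = resolve_workflow_folder(track_params) == DEFAULT_WORKFLOW_FOLDER
    if uses_logsdb and uses_default:
        # Without copy_to the message field stays empty, so the default
        # queries could fail or look misleadingly fast.
        return "index_mode is logsdb but workflow-folder was not overridden"
    return None


params = {"index_mode": "logsdb", "workflow-folder": "workflows-logsdb"}
folder = resolve_workflow_folder(params)
warning = check_params(params)
```

With Rally this would presumably be passed as something like `--track-params="index_mode:logsdb,workflow-folder:workflows-logsdb"`.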

@gareth-ellis @charlie-pichette does it sound reasonable?

We will revert this once synthetic source supports copy_to.

salvatore-campagna commented 1 month ago

BTW this would have been much easier if this track used the standard logs@settings component template. @charlie-pichette any chance to do that instead?

salvatore-campagna commented 1 month ago

I executed a test for the stateful case and double-checked that all data stream backing indices are actually using LogsDB. I used a very small dataset to keep the benchmark fast, since its only purpose was to confirm that LogsDB was used.

esbench@elasticsearch-0:~$ curl -XGET -s -k -u esbench:super-secret-password "https://elasticsearch-0:9200/.ds-packetbeat-default-2024.09.04-000001/_settings?pretty" | grep "logsdb"
        "mode" : "logsdb",
esbench@elasticsearch-0:~$ curl -XGET -s -k -u esbench:super-secret-password "https://elasticsearch-0:9200/.ds-auditbeat-default-2024.09.04-000001/_settings?pretty" | grep "logsdb"
        "mode" : "logsdb",
esbench@elasticsearch-0:~$ curl -XGET -s -k -u esbench:super-secret-password "https://elasticsearch-0:9200/.ds-filebeat-default-2024.09.04-000001/_settings?pretty" | grep "logsdb"
        "mode" : "logsdb",
esbench@elasticsearch-0:~$ curl -XGET -s -k -u esbench:super-secret-password "https://elasticsearch-0:9200/.ds-metricbeat-default-2024.09.04-000001/_settings?pretty" | grep "logsdb"
        "mode" : "logsdb",
esbench@elasticsearch-0:~$ curl -XGET -s -k -u esbench:super-secret-password "https://elasticsearch-0:9200/.ds-winlogbeat-default-2024.09.04-000001/_settings?pretty" | grep "logsdb"
        "mode" : "logsdb",
esbench@elasticsearch-0:~$ curl -XGET -s -k -u esbench:super-secret-password "https://elasticsearch-0:9200/_cat/indices"
yellow open .ds-packetbeat-default-2024.09.04-000001 q1ZIl_THTFahWAA80Qqn_A 1 1   0 0    249b    249b    249b
yellow open .ds-auditbeat-default-2024.09.04-000001  kfKD56XeSEeO7qRk4gBsmA 1 1   0 0    249b    249b    249b
yellow open .ds-filebeat-default-2024.09.04-000001   3rlOpJ37T7W2woFGrJ_Cww 1 1 600 0 427.9kb 427.9kb 427.9kb
yellow open .ds-metricbeat-default-2024.09.04-000001 ELjRLFRXQc2RPQ5K1XvnlQ 1 1   0 0    249b    249b    249b
yellow open .ds-winlogbeat-default-2024.09.04-000001 dfDfSrA5RHadWk5x8jTyLg 1 1   0 0    249b    249b    249b
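
The same check could be scripted instead of grepping each index by hand. A minimal sketch, assuming the JSON body returned by `GET <index>/_settings` has already been fetched (for example with the curl calls above); the function name `indices_not_using` and the sample payload are made up for illustration.

```python
import json


def indices_not_using(settings_response: dict, expected_mode: str = "logsdb") -> list:
    """Return the names of indices whose index.mode differs from expected_mode."""
    return sorted(
        name
        for name, body in settings_response.items()
        # The settings API nests the value under settings -> index -> mode.
        if body.get("settings", {}).get("index", {}).get("mode") != expected_mode
    )


# Hand-written sample shaped like a _settings response (not real output).
sample = json.loads("""
{
  ".ds-filebeat-default-2024.09.04-000001": {
    "settings": {"index": {"mode": "logsdb", "number_of_shards": "1"}}
  },
  ".ds-example-default-2024.09.04-000001": {
    "settings": {"index": {"number_of_shards": "1"}}
  }
}
""")

offenders = indices_not_using(sample)
```

An empty list means every index in the response reports the expected mode.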

salvatore-campagna commented 1 month ago

The track completed successfully:

2024-09-04 10:27:15,223 ActorAddr-(T|:39015)/PID:5993 esrally.reporter INFO
|                                                         Metric |                    Task |          Value |   Unit |
|---------------------------------------------------------------:|------------------------:|---------------:|-------:|
|                     Cumulative indexing time of primary shards |                         |    0.00583333  |    min |
|             Min cumulative indexing time across primary shards |                         |    0           |    min |
|          Median cumulative indexing time across primary shards |                         |    0           |    min |
|             Max cumulative indexing time across primary shards |                         |    0.00583333  |    min |
|            Cumulative indexing throttle time of primary shards |                         |    0           |    min |
|    Min cumulative indexing throttle time across primary shards |                         |    0           |    min |
| Median cumulative indexing throttle time across primary shards |                         |    0           |    min |
|    Max cumulative indexing throttle time across primary shards |                         |    0           |    min |
|                        Cumulative merge time of primary shards |                         |    0           |    min |
|                       Cumulative merge count of primary shards |                         |    0           |        |
|                Min cumulative merge time across primary shards |                         |    0           |    min |
|             Median cumulative merge time across primary shards |                         |    0           |    min |
|                Max cumulative merge time across primary shards |                         |    0           |    min |
|               Cumulative merge throttle time of primary shards |                         |    0           |    min |
|       Min cumulative merge throttle time across primary shards |                         |    0           |    min |
|    Median cumulative merge throttle time across primary shards |                         |    0           |    min |
|       Max cumulative merge throttle time across primary shards |                         |    0           |    min |
|                      Cumulative refresh time of primary shards |                         |    0.00271667  |    min |
|                     Cumulative refresh count of primary shards |                         |   21           |        |
|              Min cumulative refresh time across primary shards |                         |    0           |    min |
|           Median cumulative refresh time across primary shards |                         |    0           |    min |
|              Max cumulative refresh time across primary shards |                         |    0.00271667  |    min |
|                        Cumulative flush time of primary shards |                         |    0.00025     |    min |
|                       Cumulative flush count of primary shards |                         |    5           |        |
|                Min cumulative flush time across primary shards |                         |    3.33333e-05 |    min |
|             Median cumulative flush time across primary shards |                         |    3.33333e-05 |    min |
|                Max cumulative flush time across primary shards |                         |    0.000116667 |    min |
|                                        Total Young Gen GC time |                         |    0.122       |      s |
|                                       Total Young Gen GC count |                         |    4           |        |
|                                          Total Old Gen GC time |                         |    0           |      s |
|                                         Total Old Gen GC count |                         |    0           |        |
|                                                   Dataset size |                         |    0.000409081 |     GB |
|                                                     Store size |                         |    0.000409081 |     GB |
|                                                  Translog size |                         |    2.56114e-07 |     GB |
|                                         Heap used for segments |                         |    0           |     MB |
|                                       Heap used for doc values |                         |    0           |     MB |
|                                            Heap used for terms |                         |    0           |     MB |
|                                            Heap used for norms |                         |    0           |     MB |
|                                           Heap used for points |                         |    0           |     MB |
|                                    Heap used for stored fields |                         |    0           |     MB |
|                                                  Segment count |                         |    4           |        |
|                                    Total Ingest Pipeline count |                         |  600           |        |
|                                     Total Ingest Pipeline time |                         |    0.369       |      s |
|                                   Total Ingest Pipeline failed |                         |    0           |        |
|                                                 Min Throughput |        insert-pipelines |   19.29        |  ops/s |
|                                                Mean Throughput |        insert-pipelines |   19.29        |  ops/s |
|                                              Median Throughput |        insert-pipelines |   19.29        |  ops/s |
|                                                 Max Throughput |        insert-pipelines |   19.29        |  ops/s |
|                                       100th percentile latency |        insert-pipelines |  649.265       |     ms |
|                                  100th percentile service time |        insert-pipelines |  649.265       |     ms |
|                                                     error rate |        insert-pipelines |    0           |      % |
|                                                 Min Throughput |              insert-ilm |   36.83        |  ops/s |
|                                                Mean Throughput |              insert-ilm |   36.83        |  ops/s |
|                                              Median Throughput |              insert-ilm |   36.83        |  ops/s |
|                                                 Max Throughput |              insert-ilm |   36.83        |  ops/s |
|                                       100th percentile latency |              insert-ilm |   80.0769      |     ms |
|                                  100th percentile service time |              insert-ilm |   80.0769      |     ms |
|                                                     error rate |              insert-ilm |    0           |      % |
|                                                 Min Throughput | bulk-index-initial-load |  198.98        | docs/s |
|                                                Mean Throughput | bulk-index-initial-load |  198.98        | docs/s |
|                                              Median Throughput | bulk-index-initial-load |  198.98        | docs/s |
|                                                 Max Throughput | bulk-index-initial-load |  198.98        | docs/s |
|                                        50th percentile latency | bulk-index-initial-load |   69.212       |     ms |
|                                        90th percentile latency | bulk-index-initial-load |  523.803       |     ms |
|                                       100th percentile latency | bulk-index-initial-load |  696.481       |     ms |
|                                   50th percentile service time | bulk-index-initial-load |   69.212       |     ms |
|                                   90th percentile service time | bulk-index-initial-load |  523.803       |     ms |
|                                  100th percentile service time | bulk-index-initial-load |  696.481       |     ms |
|                                                     error rate | bulk-index-initial-load |    0           |      % |
|                                                 Min Throughput |                   hosts |    0.1         |  ops/s |
|                                                Mean Throughput |                   hosts |    0.12        |  ops/s |
|                                              Median Throughput |                   hosts |    0.11        |  ops/s |
|                                                 Max Throughput |                   hosts |    0.16        |  ops/s |
|                                        50th percentile latency |                   hosts |  204.623       |     ms |
|                                        90th percentile latency |                   hosts | 1638.86        |     ms |
|                                       100th percentile latency |                   hosts | 2196.19        |     ms |
|                                   50th percentile service time |                   hosts |  201.545       |     ms |
|                                   90th percentile service time |                   hosts |  497.851       |     ms |
|                                  100th percentile service time |                   hosts | 1637.9         |     ms |
|                                                     error rate |                   hosts |    0           |      % |
|                                                 Min Throughput |                overview |    0.02        |  ops/s |
|                                                Mean Throughput |                overview |    0.04        |  ops/s |
|                                              Median Throughput |                overview |    0.04        |  ops/s |
|                                                 Max Throughput |                overview |    0.05        |  ops/s |
|                                        50th percentile latency |                overview |  341.835       |     ms |
|                                        90th percentile latency |                overview | 1151.38        |     ms |
|                                       100th percentile latency |                overview | 1346.12        |     ms |
|                                   50th percentile service time |                overview |  296.942       |     ms |
|                                   90th percentile service time |                overview | 1149.4         |     ms |
|                                  100th percentile service time |                overview | 1154.91        |     ms |
|                                                     error rate |                overview |    0           |      % |
|                                                 Min Throughput |                 network |    0.06        |  ops/s |
|                                                Mean Throughput |                 network |    0.09        |  ops/s |
|                                              Median Throughput |                 network |    0.09        |  ops/s |
|                                                 Max Throughput |                 network |    0.1         |  ops/s |
|                                        50th percentile latency |                 network |  241.289       |     ms |
|                                        90th percentile latency |                 network | 2146.94        |     ms |
|                                       100th percentile latency |                 network | 2827.87        |     ms |
|                                   50th percentile service time |                 network |  197.612       |     ms |
|                                   90th percentile service time |                 network |  888.805       |     ms |
|                                  100th percentile service time |                 network | 2366.76        |     ms |
|                                                     error rate |                 network |    0           |      % |

charlie-pichette commented 1 month ago

> BTW this would have been much easier if this track used the standard logs@settings component template. @charlie-pichette any chance to do that instead?

I have no information about logs@settings, so I do not know what the impact is or how this is used in production. I am happy to evaluate the implications of changing the track to use that if you can provide me information on it.

achuguy commented 1 month ago

@salvatore-campagna Regarding the logs@settings component template, we can add it to the composable templates at https://github.com/elastic/rally-tracks/tree/master/elastic/security/templates/composable. I can create a separate PR unless you want to add it in this PR.

gareth-ellis commented 1 month ago

@elasticmachine update branch