elastic / rally

Macrobenchmarking framework for Elasticsearch
Apache License 2.0

Fix running tracks across multiple load driver machines #1763

Closed b-deam closed 1 year ago

b-deam commented 1 year ago

This commit adds a 'new' Bootstrap message handler to the Worker actor, which is responsible for ensuring that the new actor has the correct configuration and track modules loaded into its system path. This restores the ability to run tracks across multiple load driver machines.
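As a rough illustration of the idea (not the code in this PR), a bootstrap handler could look something like the sketch below. The `Bootstrap` message fields, the `Worker` stand-in class, and the example path are all hypothetical; the `receiveMsg_<TypeName>` method name follows the dispatch convention used by Thespian's `ActorTypeDispatcher`, but no actor library is imported here so the sketch stays self-contained.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class Bootstrap:
    """Hypothetical message carrying what a freshly created Worker needs."""
    config: dict
    track_paths: list = field(default_factory=list)


class Worker:
    """Plain-Python stand-in for an actor; only bootstrap handling is shown."""

    def __init__(self):
        self.config = None

    def receiveMsg_Bootstrap(self, msg, sender=None):
        # Keep the configuration sent along with the message.
        self.config = msg.config
        # Make track modules importable in this process, skipping
        # entries that are already on the system path.
        for path in msg.track_paths:
            if path not in sys.path:
                sys.path.insert(0, path)
```

With a handler like this, a Worker that was not forked from the coordinator process still ends up with the same configuration and import paths once it has processed the message.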

Tracks worked across Worker actors on the coordinator machine (where the race was invoked) because new actors created with multiprocTCPBase are, by default, forked (in the Unix sense) from their parent process. A forked process inherits the memory and other properties of its parent, including the system path.

When a new Worker was created on a remote load driver machine, the existing parent process there was missing these paths, so the track modules could not be found on the load path.
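The difference can be reproduced outside of Rally with a minimal sketch (Unix only). Here a forked child reports whether an entry added to `sys.path` in the parent is still visible, and a freshly started interpreter stands in for a process that was not forked from the coordinator. The function names and the path passed in are made up for illustration.

```python
import os
import subprocess
import sys


def forked_child_sees_path(extra_path):
    """Fork and report whether the child's inherited sys.path has extra_path."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: sys.path is a copy of the parent's at the time of the fork.
        os.close(read_fd)
        os.write(write_fd, b"1" if extra_path in sys.path else b"0")
        os._exit(0)
    # Parent: read the child's one-byte answer and reap the child.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer == b"1"


def fresh_interpreter_sees_path(extra_path):
    """Start a new interpreter and report whether its sys.path has extra_path."""
    code = f"import sys; print({extra_path!r} in sys.path)"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"
```

After the parent inserts a directory into `sys.path`, the forked child sees it, while the fresh interpreter does not unless the directory reaches it some other way, for example through `PYTHONPATH` or an explicit message.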

Notes:

Fixes https://github.com/elastic/rally/issues/1752
Relates https://github.com/elastic/rally/issues/1206

b-deam commented 1 year ago

buildkite test this please

b-deam commented 1 year ago

Hmm, there's some race condition here related to multiple workers on the same host all loading the track (i.e. running git operations) in parallel:

2023-08-21 04:42:59,804 ActorAddr-(T|:61276)/PID:44024 esrally.utils.process INFO fatal: Unable to create '/Users/bradleydeam/.rally/benchmarks/tracks/rally-tracks-compat/.git/index.lock': File exists.

I'll work out a fix.
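One generic way to avoid this kind of collision, shown here only as a sketch and not necessarily the fix adopted in this PR, is to serialize the git operations on a host with an advisory file lock. The helper name and the lock file location are assumptions; `fcntl.flock` is Unix only.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    # Opening in append mode creates the file if needed without truncating it.
    with open(lock_path, "a") as lock_file:
        # Blocks until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Each worker would wrap its track checkout or update in `with exclusive_lock(path):`, using a lock file path shared by all workers on that host, so only one of them runs git against the repository at a time.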

b-deam commented 1 year ago

Confirmed working as of the latest commit:

[image: screenshot not preserved]