bigs commented 5 years ago

Test Scenario Runners

Scenarios encode behavior of nodes in the test network. These scenarios are analyzed via metrics output. How should they work?

Overview

At present, topologies are defined in a fairly simple JSON file. I'd like to extend these with a plugin for test scenarios. Scenarios control daemons via the control protocol, creating simulated workloads that we can then analyze via Grafana, etc.

General implementation guidelines

There are two major pieces to this work.

Scenario Plugin

The testlab will have to be extended with a new plugin type that executes scenarios and populates their runtime environment with the information needed to discover the daemon network. This plugin must provide a standardized environment that every scenario can depend on. It should probably include access to Consul for service discovery and some form of scoping that limits a scenario to a subset of peers tagged with metadata.
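To make the scoping idea concrete, here is a minimal sketch of what a scenario might see. Everything in it is an assumption for illustration: the `TESTLAB_CONSUL_ADDR` and `TESTLAB_SCOPE` variable names, the `Peer` and `Env` types, and the `key=value,key=value` scope format are hypothetical, not part of testlab or Consul.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Peer is a hypothetical record for one daemon discovered by the plugin.
type Peer struct {
	ID   string
	Tags map[string]string
}

// Env is a hypothetical view of the standardized scenario environment.
type Env struct {
	ConsulAddr string            // where to reach Consul (assumed variable)
	Scope      map[string]string // metadata a peer must carry to be in scope
}

// ParseScope turns "k1=v1,k2=v2" into a map. An empty string means "no scoping".
func ParseScope(s string) (map[string]string, error) {
	scope := map[string]string{}
	if strings.TrimSpace(s) == "" {
		return scope, nil
	}
	for _, pair := range strings.Split(s, ",") {
		k, v, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("malformed scope entry %q", pair)
		}
		scope[k] = v
	}
	return scope, nil
}

// LoadEnv reads the assumed environment variables set by the plugin.
func LoadEnv() (Env, error) {
	scope, err := ParseScope(os.Getenv("TESTLAB_SCOPE"))
	if err != nil {
		return Env{}, err
	}
	return Env{ConsulAddr: os.Getenv("TESTLAB_CONSUL_ADDR"), Scope: scope}, nil
}

// InScope returns the peers whose tags match every entry in the scope.
func (e Env) InScope(peers []Peer) []Peer {
	var out []Peer
	for _, p := range peers {
		match := true
		for k, v := range e.Scope {
			if p.Tags[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	env, err := LoadEnv()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// In a real plugin these peers would come from service discovery.
	peers := []Peer{
		{ID: "peer-a", Tags: map[string]string{"role": "relay"}},
		{ID: "peer-b", Tags: map[string]string{"role": "client"}},
	}
	for _, p := range env.InScope(peers) {
		fmt.Println(p.ID)
	}
}
```

With no scope set, every peer is in scope; setting the assumed `TESTLAB_SCOPE=role=relay` would narrow the scenario to `peer-a`.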

Scenario Framework

A Go library that leverages the scenario plugin environment to power these scenarios. Open questions:

- How many scenario runners per peer?
- What does an ergonomic DSL for encoding node behaviors look like?
- Should scenario runners use Consul to synchronize on certain tasks, e.g. common keys to put/get from the DHT?

yusefnapora commented 5 years ago

Could we use Consul's service segmentation to simulate network partitions? It would be nice when testing e.g. circuit relay to know that two nodes can't communicate directly and have that enforced by the test setup.

anacrolix commented 5 years ago

YAML might be better suited, particularly for embedding snippets of stuff for other languages (bash etc.), keys, parts of scenarios etc., and the fact that humans will be writing these?

As a potential writer of scenarios I'd expect to be able to describe:

global_config:
  # some stuff like run name, permissions, consul and coordinator service configurations etc.
nodes:
  - num: 50
    network: &nat # some nat nonsense
    steps:
      - &install |
        go get dht
        some_other_install
      - |
        generate_data
        run_dht
    metrics: # how to generate metrics
  - num: 500
    network: *nat # run the same nat as the above guys
    steps:
      - *install
      - |
        run_dht_to_get_data
    metrics:

Probably some synchronization will be necessary so nodes are in expected states before certain steps run. Some variance between nodes might be necessary, for example exposing an instance ID as an environment variable or something so instances can deterministically generate random data/keys/peer IDs etc.
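As a rough sketch of the instance ID idea, a node could seed a pseudo-random generator from an environment variable so that every run with the same ID produces the same data. The variable name `TESTLAB_INSTANCE_ID` and both function names are hypothetical. `math/rand` is used on purpose because the output has to be reproducible, so it is only suitable for test data, not for real keys.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"math/rand"
	"os"
	"strconv"
)

// instanceIDVar is an assumed variable name, not something testlab defines.
const instanceIDVar = "TESTLAB_INSTANCE_ID"

// InstanceID reads the numeric instance ID from the environment.
func InstanceID() (int64, error) {
	raw, ok := os.LookupEnv(instanceIDVar)
	if !ok {
		return 0, fmt.Errorf("%s is not set", instanceIDVar)
	}
	return strconv.ParseInt(raw, 10, 64)
}

// DeterministicKey returns n pseudo-random bytes, hex encoded, derived only
// from the instance ID. The same ID always yields the same key.
func DeterministicKey(instanceID int64, n int) string {
	rng := rand.New(rand.NewSource(instanceID))
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = byte(rng.Intn(256))
	}
	return hex.EncodeToString(buf)
}

func main() {
	id, err := InstanceID()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(DeterministicKey(id, 16))
}
```

Because the key depends only on the ID, a coordinator that knows which IDs exist could also compute which keys each node will put into the DHT.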

Is this the kind of input you're after?

bigs commented 5 years ago

@yusefnapora interesting idea! partitions are not yet on the roadmap, but would love some more ideas on that matter!

bigs commented 5 years ago

@anacrolix yeah, this is absolutely lovely. i'm definitely a YAML fan, and it's a drop-in replacement so i'm totally okay with that. as far as the "steps" go, i don't want to recreate docker, so i'd want to limit the instruction set to a well-specified DSL. that's what i'm interested in establishing.

my gut feeling is that for the sake of learning what we want to do and getting something done quickly, we start by just exposing a programmatic API for controlling daemons via the control protocol. this API will include:

- environment variables for consul access
- command line options/environment variables detailing which "segment" of the network they are to control (this will correlate to a service in consul)
- seamless api for leveraging these inputs to create daemon clients
- simple api for performing basic tasks over a collection of daemon clients
  - churn (kill and redeploy task group)
  - periodic DHT queries (fixed / random CIDs)
  - peer discovery

lanzafame commented 5 years ago

@yusefnapora @bigs just making sure you are aware the service segmentation is an enterprise feature of Consul and not a part of the open source version. Not saying it isn't worth paying for, just that we will have to pay.

bigs commented 5 years ago

thanks! fortunately, we don’t need it to execute what we want.

On Tue, Mar 12, 2019 at 03:13 Adrian Lanzafame notifications@github.com wrote:

@yusefnapora @bigs just making sure you are aware the service segmentation is an enterprise feature of Consul and not a part of the open source version. Not saying it isn't worth paying for, just that we will have to pay.


libp2p / testlab

[Discussion] Test Scenario Runners #2
