FoundationDB / fdb-joshua

FoundationDB Correctness service
Apache License 2.0

Describe how to run the standard simulation tests with joshua #10

Open oleg68 opened 3 years ago

oleg68 commented 3 years ago

There are lots of simulation tests of FoundationDB in the test subdirectory of the source code.

joshua requires a tarball for running a test. What should be in the tarball for running the standard simulation tests from the test subdirectory? Do I need to pack the test subdirectory into the tarball? Do I need to pack fdbserver and other executables into the tarball?

It would be nice to have an example of such a tarball described in README.md.

jzhou77 commented 3 years ago

Yes, that's something missing in the README.md.

FYI, the tarball can be generated when building foundationdb, e.g., ninja package_tests. The package is located at cmake_outputdir/packages/correctness-VERSION.tar.gz.
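
Before submitting such a package, it can help to check what it actually contains. Below is a minimal sketch using Python's standard tarfile module. The helper name has_joshua_test is hypothetical, and the idea that the agent expects a top-level joshua_test entry is an assumption inferred from the ./joshua_test lines in the agent output in this thread, not a documented requirement.

```python
import posixpath
import tarfile


def has_joshua_test(tarball_path):
    """Return True if the archive has a top-level joshua_test entry.

    Assumption (not confirmed by the thread): the agent runs
    ./joshua_test from the root of the unpacked tarball.
    """
    with tarfile.open(tarball_path, "r:*") as tar:
        # Normalize names such as "./joshua_test" to "joshua_test".
        names = {posixpath.normpath(name) for name in tar.getnames()}
    return "joshua_test" in names
```

It could be called as has_joshua_test("cmake_outputdir/packages/correctness-VERSION.tar.gz"), substituting the real build directory and version.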

oleg68 commented 3 years ago

After

python3 -m joshua.joshua -C ../devops/clusters/joshua/fdb.cluster start --tarball '/home/oleg/work/fdb/FoundationDb/bld/packages/correctness-6.2.33.tar.gz'

something went wrong. The agent failed with

[oleg@oleg2 FdbJoshua]$ docker run --rm  --security-opt label=disable -v /home/oleg/work/fdb/devops/clusters/joshua:/opt/joshua -it foundationdb/joshua-agent:latest
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/rh/rh-python38/root/usr/local/lib64/python3.8/site-packages/joshua/joshua_agent.py", line 658, in agent
    retcode = run_ensemble(chosen_ensemble, save_on, work_dir=work_dir, timeout_command_timeout=timeout_command_timeout)
  File "/opt/rh/rh-python38/root/usr/local/lib64/python3.8/site-packages/joshua/joshua_agent.py", line 362, in run_ensemble
    for k, v in env_settings:
ValueError: not enough values to unpack (expected 2, got 1)

jzhou77 commented 3 years ago

This seems to be a bug introduced by #3.

jzhou77 commented 3 years ago

Can you try editing line 360 of /opt/rh/rh-python38/root/usr/local/lib64/python3.8/site-packages/joshua/joshua_agent.py to:

    if 'env' in properties and properties['env']:

I think this change can fix the bug.
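
The traceback is consistent with an empty env property being split into KEY=VALUE pairs. The sketch below reproduces that failure mode and shows the effect of the proposed guard. The parse_env helper and the colon-separated KEY=VALUE format are assumptions made for illustration; this is not the actual joshua_agent.py code.

```python
def parse_env(properties, guard=True):
    """Parse a hypothetical 'env' property such as 'A=1:B=2' into a dict.

    With guard=False an empty string raises ValueError, because
    ''.split(':') yields [''] and ''.split('=') yields a one-element list.
    """
    env = {}
    has_env = "env" in properties
    if guard:
        # Proposed fix: also require the value to be non-empty.
        has_env = has_env and bool(properties["env"])
    if has_env:
        env_settings = [item.split("=", 1) for item in properties["env"].split(":")]
        for k, v in env_settings:
            env[k] = v
    return env
```

Under these assumptions, parse_env({"env": ""}, guard=False) raises the same "not enough values to unpack" error, while the guarded version returns an empty dict.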

oleg68 commented 3 years ago

I couldn't test the change you proposed, but I tested #12. The agent stopped crashing:

[oleg@oleg2 FdbJoshua]$ docker run --rm  --security-opt label=disable -v /home/oleg/work/fdb/devops/clusters/joshua:/opt/joshua -it foundationdb/joshua-agent:latest
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Unpacking/var/joshua/ensembles/20210409-151019-oleg-618f4164d06d707d
20210409-151019-oleg-618f4164d06d707d6151521263419774359./joshua_test
103

20210409-151019-oleg-618f4164d06d707d4568811136083724774./joshua_test
103

20210409-151019-oleg-618f4164d06d707d5441746154163069208./joshua_test
103

20210409-151019-oleg-618f4164d06d707d3795326571262142719./joshua_test
103

20210409-151019-oleg-618f4164d06d707d2012513511425241517./joshua_test
103

20210409-151019-oleg-618f4164d06d707d4496758372103192668./joshua_test
103

20210409-151019-oleg-618f4164d06d707d2537493284973285639./joshua_test
103

20210409-151019-oleg-618f4164d06d707d6154259346310925684./joshua_test
103

20210409-151019-oleg-618f4164d06d707d1768467256300473424./joshua_test
103

20210409-151019-oleg-618f4164d06d707d6198835866733904722./joshua_test
103

20210409-151019-oleg-618f4164d06d707d3421236884922169869./joshua_test
<jobstopped>
20210409-151019-oleg-618f4164d06d707d805089129740123916./joshua_test
<jobstopped>
20210409-151019-oleg-618f4164d06d707d2605797536880722932./joshua_test
<jobstopped>
removing 20210409-151019-oleg-618f4164d06d707d /var/joshua/ensembles/20210409-151019-oleg-618f4164d06d707d

But I can't see their results:

[oleg@oleg2 FdbJoshua]$ python3 -m joshua.joshua -C ../devops/clusters/joshua/fdb.cluster tail
No active ensembles

jzhou77 commented 3 years ago

The tail command looks for the active ensemble or the given one, so in order to see your results, use:

python3 -m joshua.joshua -C ../devops/clusters/joshua/fdb.cluster tail 20210409-151019-oleg-618f4164d06d707d

You can optionally give --errors --xml arguments.
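
For scripting, the invocation above can be assembled programmatically. This is a minimal sketch: tail_command is a hypothetical helper, it uses only the options that appear in this thread (-C, tail, --errors, --xml), and placing the flags after the ensemble ID is an assumption.

```python
def tail_command(cluster_file, ensemble_id, errors=False, xml=False):
    """Build the argv list for `joshua tail` on a specific ensemble."""
    cmd = ["python3", "-m", "joshua.joshua", "-C", cluster_file, "tail", ensemble_id]
    if errors:
        cmd.append("--errors")  # optional flag mentioned in the thread
    if xml:
        cmd.append("--xml")  # optional flag mentioned in the thread
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the directory where the joshua package is importable.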

oleg68 commented 3 years ago

  1. Is there a way to display a list of tests run in the past?
  2. Is there a forum to discuss joshua?

jzhou77 commented 3 years ago

  1. Is there a way to display a list of tests run in the past?

Yes. Use python3 -m joshua.joshua list --stopped.

  2. Is there a forum to discuss joshua?

I think https://forums.foundationdb.org/ could be a good place.

oleg68 commented 3 years ago

I started a topic at https://forums.foundationdb.org/t/simulation-testing-of-foundationdb/2654 to discuss how to run tests.

It seems some extra info should be added to README.md.

PierreZ commented 3 years ago

I would love a full example which includes:

My use case is to run some simulations on the Rust client using the bindingTester.