Four test failures on clean build

rengolin commented 3 years ago

Describe the bug

Running the tests on a clean build leads to test failures.

The following tests FAILED:
     43 - modules_test (Failed)
     51 - partitions_cft (Failed)
     58 - vegeta_stress_cft (Failed)
     59 - vegeta_stress_bft (Failed)

To Reproduce

$ cmake -DTARGET=virtual ..
$ ninja
$ TEST_ENCLAVE=virtual ./tests.sh --output-on-failure

Environment information

Ubuntu 20.04, Intel W-2155 (no SGX)

Additional context

Modules test: No such file or directory: 'npm'. Needs npm installed but the script doesn't install it. I installed npm and the test still fails: Command '['npm', 'test']' returned non-zero exit status 1. Running that by hand gives me the error: no such file or directory, open '/home/rengolin/devel/msrc/CCF/build/package.json'. I don't know enough about npm to even begin to guess.

Vegeta error: No such file or directory: '/opt/vegeta/vegeta'. Is this part of the install? I can't find such a binary in the build dir after building CCF from source. The only package search result I found for "vegeta" on Ubuntu was a "tomato smashing game".

Partitions error: iptc.ip4tc.IPTCError: can't initialize filter: b'Permission denied (you must be root)'. Seems like there's no workaround. Could the script perhaps try sudo (and hope the password is cached) if the uid isn't root?

achamayou commented 3 years ago

Did you follow the instructions outlined under: https://microsoft.github.io/CCF/main/build_apps/build_setup.html

In principle, running:

$ cd <ccf_path>/getting_started/setup_vm
$ ./run.sh ccf-dev.yml

Should result in npm being installed, because of https://github.com/microsoft/CCF/blob/main/getting_started/setup_vm/ccf-dev.yml#L15 and https://github.com/microsoft/CCF/blob/main/getting_started/setup_vm/roles/nodejs/tasks/install.yml

$ dpkg -S /usr/bin/npm
nodejs: /usr/bin/npm

The same goes for vegeta: https://github.com/microsoft/CCF/blob/main/getting_started/setup_vm/ccf-dev.yml#L18, https://github.com/tsenart/vegeta which is a load testing tool, no relation to tomatoes afaik.
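A pre-flight check along these lines could report the missing tools up front instead of letting individual tests fail. This is only a sketch: `missing_dependencies` is a hypothetical helper, not part of CCF, and the two entries are simply the ones named in this thread (`npm` on the PATH, and the vegeta path from the error message).

```python
import os
import shutil

# Dependencies mentioned in this thread (assumed list, not taken from CCF).
REQUIRED_ON_PATH = ["npm"]
REQUIRED_FILES = ["/opt/vegeta/vegeta"]


def missing_dependencies(on_path=REQUIRED_ON_PATH, files=REQUIRED_FILES,
                         which=shutil.which, is_file=os.path.isfile):
    """Return the commands and file paths that cannot be found.

    `which` and `is_file` are injectable so the check can be exercised
    without touching the real filesystem.
    """
    missing = [name for name in on_path if which(name) is None]
    missing += [path for path in files if not is_file(path)]
    return missing
```

Calling `missing_dependencies()` with no arguments checks the current machine; an empty list means both tools were found.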

The partition test does use iptables and needs root. Whether it's convenient and fine to sudo in case it's cached/set up passwordless, or a bit rude to the user is perhaps a matter of opinion. I do agree with you though, I think it would be better for us to do it.
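For illustration, the "sudo only if it will not prompt" idea could look roughly like the following. It is a sketch under assumptions: `sudo_is_passwordless` and `with_privileges` are hypothetical names, and it relies on `sudo -n` (non-interactive mode), which fails instead of asking for a password.

```python
import os
import subprocess


def sudo_is_passwordless():
    """Return True if `sudo -n true` succeeds, i.e. sudo would not prompt."""
    try:
        result = subprocess.run(["sudo", "-n", "true"],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        # sudo is not installed at all.
        return False
    return result.returncode == 0


def with_privileges(cmd, euid=None, can_sudo=None):
    """Return the command to run with root privileges, or None.

    - already root: the command unchanged
    - sudo works without prompting: the command prefixed with `sudo -n`
    - otherwise: None, so the caller can skip the test or tell the user
    """
    euid = os.geteuid() if euid is None else euid
    if euid == 0:
        return list(cmd)
    if can_sudo is None:
        can_sudo = sudo_is_passwordless()
    return ["sudo", "-n", *cmd] if can_sudo else None
```

Returning `None` rather than prompting leaves the decision to the caller, which avoids surprising the user with a password request in the middle of a test run.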

rengolin commented 3 years ago

I missed the part that ccf-dev and app-dev had different dependencies. Running that playbook, I only get the root issue.

rengolin commented 3 years ago

Ok, so I've added root/sudo checks for tests.sh and the last test failing (partitions_cft) passes.

However, some other tests fail:

The following tests FAILED:
     60 - ls_sgx_cft (Failed)
     61 - ls_jwt_sgx_cft (Failed)
     62 - ls_js_sgx_cft (Failed)
     63 - ls_full_js_sgx_cft (Failed)
     64 - ls_js_jwt_sgx_cft (Failed)

All 5 with the error:

  File "/home/rengolin/devel/msrc/CCF/tests/infra/remote.py", line 112, in _get_perf
    raise ValueError(f"No performance result found (pattern is {pattern})")
                                                                └ '=> (.*)tx/s'
ValueError: No performance result found (pattern is => (.*)tx/s)

and log:

16:44:29.340 | INFO     | infra.rates:get_metrics:66 - [0||] GET /app/metrics
16:44:29.340 | INFO     | infra.rates:get_metrics:66 - 200 @2.13 {"histogram":{"buckets":null,"high":0,"low":0,"overflow":0,"underflow":234},"tx_rates":null}
16:44:29.340 | INFO     | infra.rates:get_metrics:73 - No tx rate metrics found...
16:44:29.340 | ERROR    | infra.runner:run:204 - Stopping clients due to exception
16:44:29.341 | INFO     | infra.remote:stop:483 - [127.60.23.42] closing
16:44:29.341 | ERROR    | infra.remote:log_errors:80 - Contents of /home/rengolin/devel/msrc/CCF/build/workspace/ls_sgx_cft^_client_0/err:
scenario_perf_client: malloc.c:2379: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

I'm guessing being run as root means it tried to run some benchmarks, and I didn't have the infra?
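The ValueError itself looks like a secondary symptom: if the client aborts before printing a result line, a search for the pattern has nothing to match. The sketch below reproduces that behaviour in isolation. It is not the code from tests/infra/remote.py; `get_perf` is a hypothetical stand-in that assumes the helper scans the client output line by line, and the `=> 1234tx/s` result line is an invented example of text the pattern would accept.

```python
import re

PATTERN = r"=> (.*)tx/s"  # pattern quoted in the ValueError above


def get_perf(lines, pattern=PATTERN):
    """Return the captured group from the first matching line.

    Raises ValueError with the same message as the traceback when no
    line matches, e.g. because the client crashed before reporting.
    """
    for line in lines:
        match = re.search(pattern, line)
        if match:
            return match.group(1)
    raise ValueError(f"No performance result found (pattern is {pattern})")
```

Feeding it only the `sysmalloc` assertion line from the err file raises the same ValueError, which suggests the crash in scenario_perf_client is the thing to investigate rather than the pattern.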

achamayou commented 3 years ago

@rengolin the check suid script looks good, but results in the tests being run as root all the time. We only need/want that for the partition test, which is why it's split in CI (https://github.com/microsoft/CCF/blob/main/.azure-pipelines-templates/test.yml#L17).

For the bulk of the tests, having the logs created with sensible permissions is desirable.

We do not run the performance tests against virtual builds in CI (or elsewhere), but they ought to work, I'll have a look.

rengolin commented 3 years ago

> @rengolin the check suid script looks good, but results in the tests being run as root all the time. We only need/want that for the partition test, which is why it's split in CI (https://github.com/microsoft/CCF/blob/main/.azure-pipelines-templates/test.yml#L17).

Yes, that was going to be my next step: look at how to only do that for the one test that needs it. I'll check the yaml file.

> We do not run the performance tests against virtual builds in CI (or elsewhere), but they ought to work, I'll have a look.

Well, I wasn't trying to run them. I just ran TEST_ENCLAVE=virtual ./tests.sh --output-on-failure again (which this time used sudo) and they got picked up.

Basically what I'm aiming for is to have a subset of tests that will always pass for anyone running on a local build. Whether that means using different arguments, or just not running some tests (as long as CI does), will depend on what the tests are and what we can do with them.

If there is such a subset, then perhaps we should just update the docs, so that people don't get surprised when some tests fail.
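One way to picture such a split, assuming the suite is driven by ctest and that tests needing root can be recognised by name: run everything else as the normal user, then run only the matching tests under sudo. The regex and both helper names below are hypothetical; `-R` and `-E` are ctest's include and exclude regex options.

```python
import re

# Assumption: only tests whose names start with "partitions_" need root.
NEEDS_ROOT = r"^partitions_"


def split_by_privilege(test_names, needs_root=NEEDS_ROOT):
    """Split test names into (unprivileged, privileged) lists."""
    regex = re.compile(needs_root)
    unprivileged = [t for t in test_names if not regex.search(t)]
    privileged = [t for t in test_names if regex.search(t)]
    return unprivileged, privileged


def ctest_commands(needs_root=NEEDS_ROOT):
    """Return two ctest invocations: the bulk as the current user,
    then only the root-requiring tests under sudo."""
    return [
        ["ctest", "--output-on-failure", "-E", needs_root],
        ["sudo", "ctest", "--output-on-failure", "-R", needs_root],
    ]
```

With this arrangement the logs for the bulk of the tests keep the invoking user's permissions, and only the partition test's workspace ends up owned by root.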

microsoft / CCF

Four test failures on clean build #2996