Adjust feature_no_esperanza_tx_relay_delay

AM5800 commented 5 years ago

This test failed on the graphene branch, but I am 100% certain that the failure is not connected to it.

2019-04-01 10:27:23.365000 TestFramework (INFO): Test outbound tx relay 5 times. mean: 2.28 sec, median: 1.04 sec
2019-04-01 10:27:40.739000 TestFramework (INFO): Test inbound tx relay 5 times. mean: 1.557 sec, median: 0.366 sec
2019-04-01 10:27:50.644000 TestFramework (INFO): Test outbound vote relay 5 times. mean: 0.212 sec, median: 0.217 sec
2019-04-01 10:27:57.047000 TestFramework (INFO): Test inbound vote relay 5 times. mean: 0.207 sec, median: 0.215 sec
2019-04-01 10:27:57.047000 TestFramework (ERROR): Assertion failed
Traceback (most recent call last):
  File "/home/travis/build/dtr-org/unit-e/build/unit-e-x86_64-unknown-linux-gnu/test/functional/test_framework/test_framework.py", line 156, in main
    self.run_test()
  File "/home/travis/build/dtr-org/unit-e/build/unit-e-x86_64-unknown-linux-gnu/test/functional/feature_no_esperanza_tx_relay_delay.py", line 193, in run_test
    assert median(inbound_vote_delays) < median(inbound_delays) / 3
AssertionError

full log

The main difficulty of this test is choosing the threshold in this check:

assert median(inbound_vote_delays) < median(inbound_delays) / 3

We know that votes should propagate faster, but by how much? What threshold should the test define? It turns out that this /3 is way too optimistic.
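To make the failure concrete, here is a minimal sketch that replays the check against the inbound medians and means printed in the log above. The variable and function names are hypothetical and exist only for this illustration; the numbers are copied from the log.

```python
# Inbound values copied from the Travis log above (seconds).
# The names below are illustrative, not taken from the test itself.
INBOUND_TX_MEDIAN = 0.366
INBOUND_VOTE_MEDIAN = 0.215
INBOUND_TX_MEAN = 1.557
INBOUND_VOTE_MEAN = 0.207


def median_check_passes(divisor):
    """Return True if the vote median beats the tx median divided by `divisor`."""
    return INBOUND_VOTE_MEDIAN < INBOUND_TX_MEDIAN / divisor


def mean_check_passes():
    """Return True if the vote mean is below the tx mean."""
    return INBOUND_VOTE_MEAN < INBOUND_TX_MEAN


if __name__ == "__main__":
    for divisor in (3, 2, 1):
        threshold = INBOUND_TX_MEDIAN / divisor
        print(f"/{divisor}: threshold={threshold:.3f} s, "
              f"passes={median_check_passes(divisor)}")
    print(f"mean check passes={mean_check_passes()}")
```

For this particular run the median check fails with /3 and with /2, and only passes with /1, while the comparison of means passes.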

In an attempt to determine a decent value, I created a Monte-Carlo-like synthetic test (I also did some probability computations, but I think Monte Carlo is better/cleaner).

We assume the vote propagation time to be constant and equal to 0.2 s. (I observed this value on Travis, and the mean seems to be very consistent, so I think it is OK to model it like this.) To find the regular transaction propagation time we just use the original PoissonNextSend function (hence the generator is in cpp).

Poisson mean    /3       /2       /1
2               0.955    0.992    1.00
5               0.996    0.999    1.00

The table shows the probability of passing the test for each "/X" divisor in the check above.

generator source (cpp)
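For readers who want to experiment with the model described above, a simulation of this kind can be sketched in Python. This is not the cpp generator linked above: it assumes each transaction delay is a single exponential draw (the distribution PoissonNextSend samples from), that each run collects 5 samples as in the log, and that the vote delay is a constant 0.2 s. The function name, parameters, and run count are hypothetical, and since the real generator may model the relay differently, the numbers it prints will not necessarily match the table.

```python
import random
from statistics import median

VOTE_DELAY = 0.2  # seconds; constant vote propagation time assumed in the PR text


def pass_probability(poisson_mean, divisor, samples_per_run=5,
                     runs=20_000, rng=None):
    """Estimate how often `VOTE_DELAY < median(tx_delays) / divisor` holds.

    Assumption: every tx delay is one exponential draw with mean
    `poisson_mean` seconds. The real relay path may add further delays.
    """
    rng = rng or random.Random()
    passed = 0
    for _ in range(runs):
        tx_delays = [rng.expovariate(1.0 / poisson_mean)
                     for _ in range(samples_per_run)]
        if VOTE_DELAY < median(tx_delays) / divisor:
            passed += 1
    return passed / runs


if __name__ == "__main__":
    for poisson_mean in (2, 5):
        for divisor in (3, 2, 1):
            p = pass_probability(poisson_mean, divisor)
            print(f"mean={poisson_mean} /{divisor}: {p:.3f}")
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing candidate divisors such as 2/3 or 3/4.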

Based on the results above, I am switching to:

assert mean(inbound_vote_delays) < mean(inbound_delays)

I have also added some debug logs.

Signed-off-by: Aleksandr Mikhailov aleksandr@thirdhash.com

cornelius commented 5 years ago

Interesting, when going to the "Commits" tab of this PR, your name is not clickable.

That's probably because of using an email address in the git commit which is not known to GitHub.

frolosofsky commented 5 years ago

It looks like we're switching from a "too optimistic" scenario to a "better than nothing" one. There are plenty of numbers between 1/1 and 1/2, like 2/3 or 3/4. Maybe give them a chance?

AM5800 commented 5 years ago

@frolosofsky there are 3 options for this test: 1) Flaky 2) Slow 3) Weak check

I am running more experiments, but it seems to me that we are tending toward option 3 anyway.

AM5800 commented 5 years ago

I intentionally broke the test to take a look at its output on the "real" Travis. Setting the WIP tag for now.

AM5800 commented 5 years ago

Closing in favor of #878

dtr-org / unit-e

Adjust feature_no_esperanza_tx_relay_delay #875