Closed elboulangero closed 10 months ago
Hi @elboulangero
Great you're looking at this. We've added quite a few tests recently, including the failing ones above.
It's always possible the timeouts are due to coding errors of course, or even code which could do with being optimised. On my x86_64 machine (Old 3rd Gen Xeon E3), the common tests including the RSA test take 150ms to run and the bitmap tests take 500ms to run. The RSA test in particular seems way out - I can't see my old machine being 40 times quicker than an Alpha, unless you've got a particularly old one there.
Have you tried the timeouts for the Alpha? Do they work?
Hello @matt335672 ! All I tried so far is that the PR #2214 builds (no compilation error). I don't know yet if it solves the timeouts, as I can only test locally on x86_64 (ie. my own machine). However what I can do is try to push this patch in Debian, so that a new version of the Debian package gets built on all the builders for all the architectures. It's probably the simpler way to see if the patch works, so I'm going to do that and submit the patch to the Debian maintainer.
the common tests including the RSA test take 150ms to run and the bitmap tests take 500ms to run
The time it takes for the tests is not only related to the hardware, but also how busy the machine is. The machines that build Debian packages would typically build and test several packages at once, so the tests might take much longer due to that. Also for the RSA key generation, I think the time it takes to generate a key also depends on the entropy available in the system. If there's not enough entropy, I guess the call just waits until there's enough.
That's why I proposed to set a very long timeout for this one (60 seconds).
Have you tried the timeouts for the Alpha? Do they work?
Let me see if the Debian maintainer wants to accept this patch.
That sounds like a sensible way forward. If we've still got a problem, we can consider other ways to look into it.
I'm not super keen on increasing the timeout for all the tests in the suite however, as it may mask other problems. We can always move these particular tests to their own suite with their own timeouts if they are slow on Alpha.
I'm not super keen on increasing the timeout for all the tests in the suite however, as it may mask other problems
So just to clarify: that's not what I propose either. The PR #2214 is about increasing timeouts for some test cases. Each test case might contain more than one test. Apparently it's not possible to increase the timeout per test: the function tcase_set_timeout()
acts on all the tests within the test case.
Cf https://libcheck.github.io/check/doc/check_html/check_4.html#Test-Timeouts:
The timeout for a specific test case, which may contain multiple unit tests, can be changed with the tcase_set_timeout() function.
Also of interest in this doc:
The default timeout used for all test cases can be changed with the environment variable CK_DEFAULT_TIMEOUT
It's good to know, but it's not what I propose. Since only a few particular tests have been seen to fail, I think it's a better approach to target those tests in particular.
Anyway, I'll let you know if/when the patch is accepted in Debian, and if it fixes the issues. Thanks for your quick feedback!
Sorry @elboulangero - I'm getting my terms mixed up. I think we need to separate out the unit tests which may take longer into separate cases, rather than increasing the timeouts for all the cases we currently have. So I think we're really talking about the same thing.
@matt335672 I understand your point now, thanks for clarifying! Indeed we're on the same line
[Copied from #2214]
Some more analysis.
At the moment, all the SSL calls are lumped together in one test suite which has a common timeout.
I can reproduce this by running the test suite under valgrind to simulate a slow machine. What I'm finding is that pretty much all the time is being taken up in ssl_gen_key_xrdp1(). Drilling into this, all the time is being generated in the call to EVP_PKEY_generate() (i.e. in the SSL library call).
I've got a Xeon E3-1275 v3 which is about a decade old now. On that, the call to EVP_PKEY_generate() normally takes 150ms, but amazingly takes 15-30 seconds under valgrind. The next slowest test under valgrind in this suite is test_sha1_hash_ok() at 112ms.
I'll put together a more nuanced version of this PR to split the test case up a bit and add an extended timeout to the ssl_gen_key_xrdp1() case only.
Hi, recently xrdp 0.19 was uploaded to Debian. Some tests timed out on some architectures: https://buildd.debian.org/status/package.php?p=xrdp&suite=sid
armhf architecture - test_gen_key_xrdp1 timed out
These tests were introduced in
v0.9.19
.From what I see, this test generates a RSA key, so it's very possible (and legitimate) that it times out, I suppose. What about using tcase_set_timeout for this test, and set it to something real long, like 1 minute? It seems that by default, the timeout is 4 seconds.
Complete build logs at https://buildd.debian.org/status/fetch.php?pkg=xrdp&arch=armhf&ver=0.9.19-1&stamp=1648811785&raw=0
alpha architecture - test_bitmap_loadmaxwidth{zoom,scale}ok
These tests were introduced in
v0.9.18
, and Debian jumped straight fromv0.0.17
tov0.0.19
, so we hit those timeouts only now.I'm not sure, but it seems that the Alpha architecture is long dead, so it's not super important. Still, it wouldn't take much to fix those tests, just add a
tcase_set_timeout
, I guess.Complete build logs at https://buildd.debian.org/status/fetch.php?pkg=xrdp&arch=alpha&ver=0.9.19-1&stamp=1648818829&raw=0
Patch is on the way
Please tell me what you think, if you're ok with tcase_set_timeout, how long a timeout you'd like, and I'll submit a patch asap. Thanks!
EDIT: That was straightforward so I already opened a merge request: https://github.com/neutrinolabs/xrdp/pull/2214