Closed muzarski closed 1 month ago
Rebased on master.
The async test I enabled started to fail (even when I only rebased on master). I can't reproduce it locally:
I'll update an integration test to make use of cass_future_wait
instead of cass_future_wait_timed
to see if the issue is with the timed version itself, or not.
I changed wait_timed
to wait
in integration test. There are multiple client timeout errors:
/home/runner/work/cpp-rust-driver/cpp-rust-driver/tests/src/integration/objects/future.hpp:152: Failure
Expected: CASS_OK
To be equal to: wait_code
Which is: CASS_ERROR_LIB_REQUEST_TIMED_OUT [Request timed out]
Request timed out: Request timeout: Request took longer than 12000ms: deadline has elapsed
Link to the job: https://github.com/scylladb/cpp-rust-driver/actions/runs/10201719963/job/28224244570?pr=153#step:7:8477.
Recently, we merged a PR that sets a default client timeout to 12000ms
(which is huge!). I'm not sure why the timeouts appear - these are simple insert queries. Link to the test case: https://github.com/scylladb/cpp-rust-driver/blob/master/tests/src/integration/tests/test_async.cpp#L99-L111.
I'll try to investigate further, but it seems that the test failure is not related to cass_future_wait_timed
. The test fails due to client request timeouts. As I mentioned previously, the test passes for me locally. Maybe it's some GH actions issue - query timing out after 12s is really weird.
cc: @Lorak-mmk @wprzytula
The async test I enabled started to fail (even when I only rebased on master). I can't reproduce it locally:
* for scylla tests, my local version of ccm for some reason can't create a 5.0.0 scylla cluster
What is the error? also we should update to newer version / test with more than 1 version.
* for cassandra - the test passes
I'll update an integration test to make use of
cass_future_wait
instead ofcass_future_wait_timed
to see if the issue is with the timed version itself, or not.
I changed
wait_timed
towait
in integration test. There are multiple client timeout errors:/home/runner/work/cpp-rust-driver/cpp-rust-driver/tests/src/integration/objects/future.hpp:152: Failure Expected: CASS_OK To be equal to: wait_code Which is: CASS_ERROR_LIB_REQUEST_TIMED_OUT [Request timed out] Request timed out: Request timeout: Request took longer than 12000ms: deadline has elapsed
Link to the job: https://github.com/scylladb/cpp-rust-driver/actions/runs/10201719963/job/28224244570?pr=153#step:7:8477.
Recently, we merged a PR that sets a default client timeout to
12000ms
(which is huge!). I'm not sure why the timeouts appear - these are simple insert queries. Link to the test case: https://github.com/scylladb/cpp-rust-driver/blob/master/tests/src/integration/tests/test_async.cpp#L99-L111.I'll try to investigate further, but it seems that the test failure is not related to
cass_future_wait_timed
. The test fails due to client request timeouts. As I mentioned previously, the test passes for me locally. Maybe it's some GH actions issue - query timing out after 12s is really weird.cc: @Lorak-mmk @wprzytula
So previously there was no client request timeout at all? And now that you introduced it, queries are timing out? Tests are running in valgrind, so it's possible for queries to take a longer time, but it shouldn't be this long. When running locally, do you also use valgrind?
So previously there was no client request timeout at all? And now that you introduced it, queries are timing out? Tests are running in valgrind, so it's possible for queries to take a longer time, but it shouldn't be this long. When running locally, do you also use valgrind?
Good thing that you noticed that. I missed it. Indeed, when I run the tests locally under valgrind, the timeouts appear.
The async test I enabled started to fail (even when I only rebased on master). I can't reproduce it locally:
* for scylla tests, my local version of ccm for some reason can't create a 5.0.0 scylla cluster
What is the error?
Nvm, it's not ccm issue. I was just missing openjdk-11
locally which is needed to setup 5.0.0
cluster apparently.
also we should update to newer version / test with more than 1 version.
What version would you suggest? 5.4.8?
The async test I enabled started to fail (even when I only rebased on master). I can't reproduce it locally:
* for scylla tests, my local version of ccm for some reason can't create a 5.0.0 scylla cluster
What is the error?
Nvm, it's not ccm issue. I was just missing
openjdk-11
locally which is needed to setup5.0.0
cluster apparently.
I recommend using ccm by nix, it removed a lot of my frustrations with ccm.
also we should update to newer version / test with more than 1 version.
What version would you suggest? 5.4.8?
6.0
v2: addressed review comments
There are still timeouts appearing. The conclusion is that the timeouts appear due to tests being run under valgrind
(reproducing locally as well). I see two solutions here:
valgrind
(right now, there is only one failing test due to timeouts).wait_timed
to wait
).I think the former approach is better, as it allows us to test cass_future_wait_timed
. WDYT? @Lorak-mmk @wprzytula
v2: addressed review comments
There are still timeouts appearing. The conclusion is that the timeouts appear due to tests being run under
valgrind
(reproducing locally as well). I see two solutions here:* Run the timeout-sensitive tests in a separate CI step without `valgrind` (right now, there is only one failing test due to timeouts). * Modify the failing test so it disables the global client timeout, and waits for the future indefinitely (change `wait_timed` to `wait`).
I think the former approach is better, as it allows us to test
cass_future_wait_timed
. WDYT? @Lorak-mmk @wprzytula
In second approach, if you use wait instead of wait_timed, then how does it test wait_timed?
v2: addressed review comments There are still timeouts appearing. The conclusion is that the timeouts appear due to tests being run under
valgrind
(reproducing locally as well). I see two solutions here:* Run the timeout-sensitive tests in a separate CI step without `valgrind` (right now, there is only one failing test due to timeouts). * Modify the failing test so it disables the global client timeout, and waits for the future indefinitely (change `wait_timed` to `wait`).
I think the former approach is better, as it allows us to test
cass_future_wait_timed
. WDYT? @Lorak-mmk @wprzytulaIn second approach, if you use wait instead of wait_timed, then how does it test wait_timed?
That's the point - it does not. That's why I prefer to run this specific test without valgrind.
Sorry, I misread your message. Separate step complicates the CI, but in this case it seems reasonable to me.
v3: running the failing async test in a separate CI step without valgrind
I'll update Scylla version in CI in a separate PR.
v3.1: deduplicate error message variable in callback test
v3.2: fix CI step description for timeout-sensitive cassandra integration tests
Rebased on master
Rebased on master
v4: added proper synchronization between cass_future_wait
and cass_future_wait_timed
callers.
v4.1:
v4.2:
ref: https://github.com/scylladb/cpp-rust-driver/issues/42
This PR implements a
cass_future_wait_timed
API function which allows to wait for a future to resolve within a given period of time.Pre-review checklist
.github/workflows/build.yml
ingtest_filter
..github/workflows/cassandra.yml
ingtest_filter
.