Open bryder opened 2 years ago
Are you sure that it is the number of keys that is the problem? There are a few tests that test the max key size (usually one page) that trigger those.
Unfortunately, Rust's test harness doesn't support much runtime introspection, so doing anything about it at runtime isn't really easy. The quota-hitting tests could be put behind a #[serial] attribute though. Other than that, docs are all I think I can really offer.
I'm reasonably sure it is the key count - but I'll do some experiments and report my results.
From what I saw random tests were failing - but I didn't carefully record exactly which tests failed.
I do know if I set the limit to 1000 the tests never failed (I ran 3 or 4 runs to check it). But I'll do more exhaustive testing.
Also - thanks for doing such a great job with this crate! I'm particularly impressed with the test suite.
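A handful of manual runs can miss an intermittent failure. As a minimal sketch, a command can be repeated and the failed runs counted. The helper name count_failures is hypothetical and not part of the crate or its tooling.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
count_failures(){
    local runs="${1:?run count required}"
    shift
    local failures=0
    local i=0
    while [ "${i}" -lt "${runs}" ]; do
        # Discard the output; only the exit status of each run matters here
        if ! "$@" >/dev/null 2>&1; then
            failures=$(( failures + 1 ))
        fi
        i=$(( i + 1 ))
    done
    echo "${failures}"
}

# Example (not run here): count_failures 20 cargo test --lib
```

A result of 0 over many runs is still only evidence, not proof, that a given limit is safe.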
Using this in check_maxkeys.bash:
#!/usr/bin/bash
set_maxkeys(){
    local maxkeys="${1:?"You must set the maxkeys count as \$1"}"
    local maxkeys_file="/proc/sys/kernel/keys/maxkeys"
    # Writing the limit needs root, hence sudo tee
    echo "${maxkeys}" | sudo tee "${maxkeys_file}"
    echo "Maxkeys: $(< "${maxkeys_file}")"
}
run_tests(){
    # only run the unittests - the binary tests never fail
    local output
    output="$(cargo test --lib)"
    echo "${output}"
}
for (( i=200; i < 1000; i+=100)) ; do
    set_maxkeys "$i"
    run_tests
done
The result was
./check_maxkeys.bash |& grep -e Maxkeys -e 'test result:'
Maxkeys: 200
test result: FAILED. 79 passed; 89 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
Maxkeys: 300
test result: FAILED. 112 passed; 56 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
Maxkeys: 400
test result: FAILED. 137 passed; 31 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
Maxkeys: 500
test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.18s
Maxkeys: 600
test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.19s
Maxkeys: 700
test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.22s
Maxkeys: 800
test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.09s
Maxkeys: 900
test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.16s
So it is dependent on the maxkeys setting as far as I can see.
It always works at 500, and at 400 or less I get a varying number of failures on my system.
And the error is always
panicked at 'called `Result::unwrap()` on an `Err` value: Errno { code: 122, description: Some("Disk quota exceeded") }'
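The error itself does not say which quota was hit: the kernel returns EDQUOT for both the per-user key-count limit and the per-user byte limit. One way to tell them apart, assuming /proc/key-users has the layout described in keyrings(7), is to print both ratios for a user. The helper name quota_usage is hypothetical.

```shell
# Hypothetical helper: print key-count and byte quota usage for one UID.
# Reads key-users formatted text on stdin. Assumed field layout, per keyrings(7):
#   <uid>: <usage> <nkeys>/<nikeys> <qnkeys>/<maxkeys> <qnbytes>/<maxbytes>
quota_usage(){
    local uid="${1:?uid required}"
    # The first field carries a trailing colon, so compare against "<uid>:"
    awk -v uid="${uid}:" '$1 == uid { print "keys " $4 " bytes " $5 }'
}

# Example (not run here): quota_usage "$(id -u)" < /proc/key-users
```

If the keys ratio is at its limit while the bytes ratio is far from it, the key count is the constraint.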
OK, seems that does correlate. Do you mind making some measurements as to where the limit should be for various -j settings to succeed with, say, 80% probability? We can then document it (and maybe even make a script to generate the argument based on the runtime value).
Using this
#!/usr/bin/bash
maxkeys_file="/proc/sys/kernel/keys/maxkeys"
set_maxkeys(){
    local maxkeys="${1:?"You must set the maxkeys count as \$1"}"
    echo "${maxkeys}" | sudo tee "${maxkeys_file}" >/dev/null
}
# shellcheck disable=SC2120
run_tests(){
    # only run the unittests - the binary tests never fail
    local thread_count="${1:-}"
    local thread_clause=""
    if [[ -n "${thread_count}" ]] ; then
        thread_clause="--test-threads=${thread_count}"
    fi
    local output
    # thread_clause is left unquoted on purpose so that it vanishes when empty
    # shellcheck disable=SC2086
    output="$(cargo test --lib -- ${thread_clause} 2>&1 )"
    echo -n "Maxkeys: $(< "${maxkeys_file}") "
    echo "${output}" |& grep 'test result:'
}
do_tests_work(){
    local thread_count="${1:-}"
    local test_output
    test_output="$(run_tests "${thread_count}")"
    local thread_count_desc
    if [[ -z "${thread_count}" ]]; then
        thread_count_desc="default"
    else
        thread_count_desc="${thread_count}"
    fi
    if [[ "${test_output}" =~ ok ]] ; then
        echo "${test_output}"
        # maxkeys is the loop variable of the calling run_maxkeys_sweep
        echo "OK at ${thread_count_desc} threads ${maxkeys} maxkeys"
        echo
        return 0
    else
        echo "${test_output}"
        return 1
    fi
}
run_maxkeys_sweep(){
    local thread_count="${1:-}"
    if [[ -z "${thread_count}" ]] ; then
        echo "Thread count not set "
    else
        echo "Thread count set: ${thread_count}"
    fi
    # Stop at the first maxkeys value for which the whole suite passes
    for ((maxkeys = 100; maxkeys < 1000; maxkeys += 100)); do
        set_maxkeys "${maxkeys}"
        if do_tests_work "${thread_count}" ; then
            break
        fi
    done
}
# Test when no threadcount is set
run_maxkeys_sweep ""
for ((thread_count = 1 ; thread_count < 20; thread_count+=2)); do
    run_maxkeys_sweep "${thread_count}"
done
I get this
Thread count not set
Maxkeys: 100 test result: FAILED. 65 passed; 103 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.13s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.34s
Maxkeys: 300 test result: FAILED. 112 passed; 56 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.25s
Maxkeys: 400 test result: FAILED. 137 passed; 31 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.21s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.13s
OK at default threads 500 maxkeys
Thread count set: 1
Maxkeys: 100 test result: FAILED. 138 passed; 30 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.86s
Maxkeys: 200 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 5.20s
OK at 1 threads 200 maxkeys
Thread count set: 3
Maxkeys: 100 test result: FAILED. 54 passed; 114 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.21s
Maxkeys: 200 test result: FAILED. 131 passed; 37 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.50s
Maxkeys: 300 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.57s
OK at 3 threads 300 maxkeys
Thread count set: 5
Maxkeys: 100 test result: FAILED. 59 passed; 109 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.14s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.29s
Maxkeys: 300 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.36s
OK at 5 threads 300 maxkeys
Thread count set: 7
Maxkeys: 100 test result: FAILED. 62 passed; 106 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.15s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.26s
Maxkeys: 300 test result: FAILED. 135 passed; 33 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
Maxkeys: 400 test result: FAILED. 139 passed; 29 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.28s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.32s
OK at 7 threads 500 maxkeys
Thread count set: 9
Maxkeys: 100 test result: FAILED. 65 passed; 103 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.22s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.22s
Maxkeys: 300 test result: FAILED. 127 passed; 41 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.29s
Maxkeys: 400 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.30s
OK at 9 threads 400 maxkeys
Thread count set: 11
Maxkeys: 100 test result: FAILED. 66 passed; 102 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.23s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
Maxkeys: 300 test result: FAILED. 114 passed; 54 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.32s
Maxkeys: 400 test result: FAILED. 138 passed; 30 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.27s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.11s
OK at 11 threads 500 maxkeys
Thread count set: 13
Maxkeys: 100 test result: FAILED. 66 passed; 102 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.32s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.14s
Maxkeys: 300 test result: FAILED. 112 passed; 56 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.18s
Maxkeys: 400 test result: FAILED. 138 passed; 30 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.24s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.24s
OK at 13 threads 500 maxkeys
Thread count set: 15
Maxkeys: 100 test result: FAILED. 65 passed; 103 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.34s
Maxkeys: 200 test result: FAILED. 84 passed; 84 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.36s
Maxkeys: 300 test result: FAILED. 113 passed; 55 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.26s
Maxkeys: 400 test result: FAILED. 136 passed; 32 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.23s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.21s
OK at 15 threads 500 maxkeys
Thread count set: 17
Maxkeys: 100 test result: FAILED. 65 passed; 103 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.20s
Maxkeys: 200 test result: FAILED. 80 passed; 88 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.36s
Maxkeys: 300 test result: FAILED. 113 passed; 55 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.21s
Maxkeys: 400 test result: FAILED. 137 passed; 31 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.27s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.10s
OK at 17 threads 500 maxkeys
Thread count set: 19
Maxkeys: 100 test result: FAILED. 64 passed; 104 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.39s
Maxkeys: 200 test result: FAILED. 93 passed; 75 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
Maxkeys: 300 test result: FAILED. 113 passed; 55 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.21s
Maxkeys: 400 test result: FAILED. 136 passed; 32 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.27s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.20s
OK at 19 threads 500 maxkeys
Even if I run it at 100 threads
Thread count set: 100
Maxkeys: 100 test result: FAILED. 57 passed; 111 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.08s
Maxkeys: 200 test result: FAILED. 91 passed; 77 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.23s
Maxkeys: 300 test result: FAILED. 106 passed; 62 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.23s
Maxkeys: 400 test result: FAILED. 140 passed; 28 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.06s
Maxkeys: 500 test result: ok. 168 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.03s
OK at 100 threads 500 maxkeys
So 500 maxkeys seems to always work - but that will change as more tests are added so I wouldn't try to predict it myself.
You need a minimum of 200 keys at the moment - even with 1 thread.
You don't get much better run time as the thread count goes much past one.
500 works even at 100 threads.
Personally I don't think it's worth looking for an 80% threshold.
When running tests I want a 100% success rate - I've had to deal with tests with races which sometimes work and they are very problematic. You end up trying to find a bug in your code change when the bug was in the test all along.
I'd be inclined to recommend either setting maxkeys to 500 or running the tests with 1 thread (and checking that maxkeys is at least 200).
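That recommendation can be sketched as a small helper that turns the current maxkeys value into a test harness argument. The thresholds 200 and 500 are just the ones measured above for the 168 lib tests at this revision, so treat them as assumptions that will drift as tests are added. The helper name test_threads_arg is hypothetical.

```shell
# Hypothetical helper: pick a --test-threads argument from a maxkeys value.
# Thresholds come from the measurements in this thread and are assumptions.
test_threads_arg(){
    local maxkeys="${1:?maxkeys required}"
    if [ "${maxkeys}" -ge 500 ]; then
        # Enough quota for the default thread count: no extra argument
        echo ""
    elif [ "${maxkeys}" -ge 200 ]; then
        # Only the single-threaded run passed reliably in this range
        echo "--test-threads=1"
    else
        echo "maxkeys=${maxkeys} is below 200; the tests are expected to fail" >&2
        return 1
    fi
}

# Example (not run here; arg is left unquoted so it vanishes when empty):
#   arg="$(test_threads_arg "$(cat /proc/sys/kernel/keys/maxkeys)")" && cargo test --lib -- ${arg}
```

Keys the user already holds outside the test run count against the same quota, so the thresholds are a lower bound rather than a guarantee.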
Thanks! A note in the README sounds good to me. CI already does one thread and it's all "the environment" which isn't something that should be poked at too much by a test suite. Feel free to send a PR if I don't get to it today.
Problem
When running cargo test in threaded mode, random tests fail with libc::EDQUOT, i.e. Some("Disk quota exceeded").
It would be good to put a note about this somewhere (or maybe read maxkeys and warn on it?)
Version
Tested on this revision on master: e4b35b614af249bf1fbec7a9d2c0a662009c2b01
Environment
Ubuntu 20.04 under WSL2 on Windows 11.
The machine reports 12 procs.
Cause and 'fix'
The cause is that the default user quota is too small. This is the default
One way to fix it is
The other way is to limit the number of threads cargo runs, but I had to set it to 1 for it to work. Even with 2, random tests failed.
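As a rough sketch of the first approach, the limit can be checked before running the tests, using the same /proc/sys/kernel/keys/maxkeys path as the scripts in this thread. The helper name maxkeys_at_least is hypothetical, and the value 500 in the example is the one that worked in the measurements here, not a general guarantee.

```shell
maxkeys_file="/proc/sys/kernel/keys/maxkeys"

# Hypothetical helper: succeed when the current limit is at least the wanted value.
maxkeys_at_least(){
    local wanted="${1:?wanted maxkeys required}"
    local current
    current="$(cat "${maxkeys_file}")"
    echo "maxkeys: ${current} (wanted at least ${wanted})"
    [ "${current}" -ge "${wanted}" ]
}

# Example (not run here; writing the limit needs root and does not survive a reboot):
#   maxkeys_at_least 500 || echo 500 | sudo tee "${maxkeys_file}"
```

Keeping the write behind an explicit check avoids changing the environment when the limit is already high enough.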