Closed ggwpez closed 1 year ago
Rust does not have built-in setup/teardown support for test fixtures. But that does not mean we cannot use something like rstest
. We seem to be using thread-local-storage for just easier fixture setup in these mocks and we want to use a fresh fixture for every test case, whatever thread they are running in.
We seem to be using thread-local-storage for just easier fixture setup in these mocks and we want to use a fresh fixture for every test case, whatever thread they are running in.
Some of these locals are accessed in callbacks and not just for closure I think, so they need to be accessible in the scope surrounding the #[test]
functions.
AFAIU the rstest
only passes arguments to the test functions, right? @wigy-opensource-developer
One solution that I would not be particularly proud of:
Create a define_thread_locals!
macro to define all thread locals and creates a clear_thread_locals!
macro to clear all RefCell
with RefCell::new(Default::default())
. This could then be called in new_test_ext().execute_with
.
PS: Would be the same construction as used in https://github.com/paritytech/substrate/pull/10592
@ggwpez but what would be the final outcome we achieved here for additional complexity and work? Does not seem that really anyone executes tests like this anyway, which is why this is the first time it is being reported.
Some of these locals are accessed in callbacks and not just for closure I think, so they need to be accessible in the scope surrounding the
#[test]
functions. AFAIU therstest
only passes arguments to the test functions, right?
If we want independent tests, we need to create a fresh fixture for each of them. Because the Rust test runner starts each #[test]
on a fresh stack, it is only static and thread local state from mock.rs that can make the tests interact with each other making them erratic. (I try to use terms correctly from http://xunitpatterns.com/ )
but what would be the final outcome we achieved here for additional complexity and work? Does not seem that really anyone executes tests like this anyway, which is why this is the first time it is being reported.
Well, this is kind of what I meant with
Please confirm and decide if this should be fixed.
I'm not sure if it is worth it. Might be worth testing on a single threaded system, to see it it happens there as well @shawntabrizi.
PS: It does not compile with the current Tarpaulin version, but if it turns out not working this could be a reason to fix it.
For example when I look at the tests for pallet-assets, I do not see how this could be done with test closures since they would need to be accessible for the FrozenBalance
balance impl here. @wigy-opensource-developer
Yeah. So because the FrozenBalance
trait does not take a proper &self
or &mut self
parameter on their methods, what can we do but use static state, right? Well, wrong!
The only reasonable way forward is to not treat every type definition in a #[pallet::config]
a global variable. We hire Rust developers, but this is a completely different, limited language we can use in these macros, causing all kind of gotchas like this.
This should be completely viable for implementing FrozenBalance
:
#[derive(Debug, Default)]
pub struct TestFreezer {
frozen: HashMap<(u32, u64), u64>,
hooks: Vec<Hook>,
}
impl FrozenBalance<u32, u64, u64> for TestFreezer {
fn frozen_balance(&self, asset: u32, who: &u64) -> Option<u64> {
self.frozen.get(&(asset, who.clone())).cloned()
}
fn died(&mut self, asset: u32, who: &u64) {
self.hooks.push(Hook::Died(asset, who.clone()));
}
}
impl TestFreezer {
pub(crate) fn set_frozen_balance(&mut self, asset: u32, who: u64, amount: u64) {
self.frozen.insert((asset, who), amount);
}
pub(crate) fn clear_frozen_balance(&mut self, asset: u32, who: u64) {
self.frozen.remove(&(asset, who));
}
pub(crate) fn hooks(&self) -> Vec<Hook> {
self.hooks.clone()
}
}
I do not want to hold you back from fixing it. Just for me personally the priority of working on it is not be the highest.
You can open a MR if you want to fix it.
@wigy-opensource-developer
@shawntabrizi What I understand from the FRAME design is that all state is in the storage, and in order to emphasize the "statelessness" (purity) of the block function composed out of all this code, self
is missing from all these things intentionally. The missing piece is testability, which requires dependency injection. Although this is an article about F#, but writer on several books on the topic, Mark Seemann describes why just for the sake of testability we need to allow impurity in the final assembled runtime: https://blog.ploeh.dk/2017/01/30/partial-application-is-dependency-injection/
Using TLS for tests looks like an anti-pattern to me. Why not just use the storage? Something like this should work:
pub struct TestFreezer;
impl FrozenBalance<u32, u64, u64> for TestFreezer {
fn frozen_balance(asset: u32, who: &u64) -> Option<u64> {
let key: u128 = (asset as u128) << 64 | *who as u128;
sp_io::storage::get(&key.to_le_bytes()).map(|x| u64::from_le_bytes(x.as_slice().try_into().unwrap()))
}
}
pub(crate) fn set_frozen_balance(asset: u32, who: u64, amount: u64) {
let key: u128 = (asset as u128) << 64 | who as u128;
sp_io::storage::set(&key.to_le_bytes(), &amount.to_le_bytes());
}
pub(crate) fn clear_frozen_balance(asset: u32, who: u64) {
let key: u128 = (asset as u128) << 64 | who as u128;
sp_io::storage::clear(&key.to_le_bytes());
}
Alternatively let key = (asset, who).encode()
Using TLS for tests looks like an anti-pattern to me. Why not just use the storage? Something like this should work
Will you create a fresh storage for each and every test function? Otherwise you end up with the same interacting tests.
Each test already uses a unique instance of TestExternalities
which holds the storage. There should be no interference.
Yea there are still crates failing. Anyway we can do this cleanup lean step by step.
Does not reproduce on rust 1.70 anymore. Neither the pallet tests not the example code fail on one test thread 🎉
Observation
Many tests fail when run with
--test-threads 1
. I only tried to run thepallet
tests, eg:It also seems to happen with
--features=runtime-benchmarks
.Maybe someone can check it for tests outside of
pallet-*
. Polkadot and Cumulus took too long on my machine to compile.Assumed problem(s)
Thread local vars
The first test that fails (
assets/freezer_should_work
) should be fixed after #10473 by line which resets a dirty thread local var.I first spotted it by running Tarpaulin on Substrate, but Tarpaulin was not the problem.
The fact that the test used thread local vars and tarpaulin implicitly forces
--test-threads 1
seems to cause at least some of the failures.The following tests only fails with
--test-threads 1
:As far as I understand it:
cargo test
normally schedules the tests on multiple threads, which means that theSTICKY
var is false for each#[test]
.When
--test-threads 1
is used, the tests run consecutively and the second test that is executed gets the sticky/dirty value of the first test function; causing the failure.Other causes
I did not have the time to look into the other failures to see if they could all be failing because of dirty thread locals.
Maybe there is more amiss.
Next
Please confirm and decide if this should be fixed.
Theoretically this could also be changed in
cargo test
, such that it resets thread locals before each#[test]
:thinking: