Very long build time - Githubissues

smessmer commented 1 year ago

I added some rstest based parameterized tests to a project, and suddenly the incremental build time is very long. I just change one line in the file, but cargo needs 5-10 minutes to incrementally compile it. I had been planning to add several more tests like this, but build time seems to linearly grow with it and would quickly be several hours.

Admittedly, this is generating quite a few test cases because of a large cartesian product of test parameters (32 x 6 x 3 x 3 = 1728 test cases), but I still wouldn't have expected it to take that long to build. Particularly troublesome seems to be this test: https://github.com/cryfs/cryfs/blob/6161c8c89e35892a8c3687cc8f53685b08b2febf/src/blockstore/rust/blobstore/src/on_blocks/data_tree_store/tree.rs#L1306-L1322

Do you know any tips on how I could speed up build times here? I tried:

Using a different linker. For most of the build time, it only uses a single core so I assumed it would be the linker, but even using lld or mold doesn't speed it up significantly. Maybe it actually wasn't the linker after all but it's the rstest macro taking so much time?
Splitting tests into different submodules in different files, but that still takes the same time to build and still builds on only one core.
Moving all the code of the unit test into a separate function so that the actual test function with all the rstest attributes is just a one-liner calling into that other function and there isn't as much code that needs to be duplicated. This also didn't make a difference in the build time.

If you want to reproduce, clone the repo and commit linked above and run "cargo test" from the "src/blockstore/rust/blobstore" directory.

la10736 commented 1 year ago

Unfortunately rstest uses the standard rust test runner and adding a test means adding a new #[test] function. So we should pay attention when we use cartesian products....

I think that there is nothing that you can do better than this if you really want an independent test for each single case....

Il giorno sab 11 mar 2023 alle ore 08:34 Sebastian Meßmer < @.***> ha scritto:

I added some rstest based parameterized tests to a project, and suddenly the incremental build time is very long. I just change one line in the file, but cargo needs 5-10 minutes to incrementally compile it. I had been planning to add several more tests like this, but build time seems to linearly grow with it and would quickly be several hours.

Admittedly, this is generating quite a few test cases because of a large cartesian product of test parameters (32 x 6 x 3 x 3 = 1728 test cases), but I still wouldn't have expected it to take that long to build. Particularly troublesome seems to be this test: https://github.com/cryfs/cryfs/blob/feature/rust/src/blockstore/rust/blobstore/src/on_blocks/data_tree_store/tree.rs#L1306-L1322

Di you know any tips on how I could speed up build times here? For most of the build time, it only uses a single core so I assumed it would be the linker, but even using lld or mold doesn't speed it up significantly.

If you want to reproduce, clone the repo and commit linked above and run "cargo test" from the "src/blockstore/rust/blobstore" directory.

— Reply to this email directly, view it on GitHub https://github.com/la10736/rstest/issues/184, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA5Y34JYUHJG6TS367O64LLW3QTJPANCNFSM6AAAAAAVXKUTGU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

smessmer commented 1 year ago

Thanks for your fast reply. Do you know what it actually is that is taking so much time here? Is it expansion of the #[rstest] macros? Expansion of the #[tokio::test] macros? The #[test] macros below that? Or is it linking everything after expansion? I'm not sure how to profile this since cargo --timings just shows one entry for the whole thing (which initially made me believe it's the linker but I'm not sure about that anymore).

I feel like there must be something I can do to ideally speed compile times up, or at least make the compilation process use more than one core.

la10736 commented 1 year ago

I don't think that the macro expansion take too much time (the tokio one is really trivial) but I bet on the linker that should link tons lines of code.

To profile the compilation you can take a look at https://rustc-dev-guide.rust-lang.org/profiling.html. Anyway, if the most of the time is spending in the last stage maybe is compile and linking your test code. You can try also this https://rustc-dev-guide.rust-lang.org/parallel-rustc.html but I don't think that it can really help :cry:

smessmer commented 1 year ago

I investigated a bit more. Looks like it's mostly because rstest duplicates the boilerplate of #[tokio::test] and the number of lines becomes just so large that both macro expansion and compilation take a long time (cargo expand also takes a long time so I don't think it's just linking).

rstest with #[tokio:test] generates something like

fn test_case_1() {
  let body = async {
    let ..params.. = ..;
    test_func().await
  };
  #[allow(clippy::expect_used)]
  tokio::runtime::Builder::new_current_thread()
     .enable_all()
     .build()
     .expect("Failed building the Runtime")
     .block_on(body)
}

and rstest then duplicates that logic into the test case for every single parameter combination instead of factoring it out into a common function for all parameter combinations of the same test.

I fixed it in my code by using #[test] instead of #[tokio::test] and then writing the tokio setup logic inside of my test case, using the following macro:

macro_rules! run_tokio_test {
    ($code:block) => {
        tokio::runtime::Builder::new_current_thread()
            .enable_all()
            .build()
            .unwrap()
            .block_on(async move {$code});
    };
}

and then, instead of this

#[rstest]
...#[cases]...
#[tokio::test]
async fn my_test(
  ...#[values]...
) {
  ... code ...
}

do this:

#[rstest]
...#[cases]...
#[test]
fn my_test(
  ...#[values]...
) {
  run_tokio_test! ({
    ... code ...
  });
}

With this approach, rstest doesn't get a chance to duplicate anymore since it's now part of the test code itself and rstest only duplicates the much less verbose #[test] attribute.

I introduced this macro in this commit and:

The fix reduced compile time of cargo build --tests --features slow-tests-1 --timings from 11 minutes down to just 1 minute.
- Note that the slow-tests-1 feature only builds about 30% of the test cases from that file, there are features slow-tests-{2,3,4} as well.
- Before the fix, building all the tests with --all-features just crashed the compiler with a sigkill, probably out of memory.
- After the fix, --all-features successfully builds and takes 2.5 minutes.
The fix reduced number of lines of code generated by this module, as output by cargo expand --lib --tests --all-features, from 2,068,274 down to 1,484,944.
And all this while still running the same test cases (21,118 test cases total with --features slow-tests-1, and 61,918 test cases total with --all-features)

While this works for me for now, I think a solution within rstest that avoids duplicating the #[tokio::test] boilerplate would be much better and more user friendly. And looking a the still somewhat verbose output of cargo expand even after this fix, I think there are probably more opportunities to reduce the number of lines generated by rstest.

la10736 commented 1 year ago

Thank very much for the analysis: it's super useful catch!!! Unfortunately I din't find a smooth and simple way to fix it... your approach if great but don't scale: I cannot create an implementation for each framework that expose a test attribute and I cannot take these copies in sync :cry:

I've tried a generic approach but I should introduce a global hash table .... I'll need to investigate more.

la10736 commented 1 year ago

I did some experiments in https://github.com/la10736/test_in_test but I need to check if there is a better way

smessmer commented 1 year ago

I haven't looked at how rstest works exactly, and I'm not very familiar with rust attributes either, so my thought might be naive, but at first sight, #[tokio::test] is the innermost attribute to the test function. So technically, shouldn't #[tokio::test] be the first attribute applied to the test function, wrap it into a larger function containing its boilerplate, and then #[rstest] only works on that wrapper function? If done this way, then rstest wouldn't need to know what the exact test attribute is - just let it finish its own expansion, and work on the result. And it wouldn't duplicate anything because it just adds test cases calling the one wrapper function generated by #[tokio::test].

la10736 commented 1 year ago

Ok, you're not so far from how it works but... in the reverse order :smile: . Every procedural macro attribute works on its surrounded code: call the function associated to the attribute with the block's syntax tree that follow the attribute and replace it with the output syntax tree returned. Compiler resolve procedural macros in the order that they are in the code.

In this scenario #[tokio::test] attribute take the block (an async function without any argument) and transform it in a synchronous test function by create the runner and call something like block_on(inner_async_function.await).... that is how almost all test procedural macros works. So, rstest cannot change this behavior and #[tokio::test] will always generate just a test function without any arguments. In https://github.com/la10736/test_in_test the trick is to create a context to resolve it from the inner function in the test generated from #[tokio::test] to work around the lack of arguments.... I need to elaborate it to try to remove the Send constrains because ideally I'm in the same thread so I should store and get it also if it is not Send ... thread_local! should be enough.... My fear here is that I've not too much time to work on it .... but I'll do my best because I think that you've pointed out a really interesting point.

smessmer commented 1 year ago

Oh I see, the argument point makes sense. How did you solve this in today's rstest? Do you replace #[tokio::test] with your own logic that allows parameters?

Are there test generators that might run multiple tests in parallel on the same thread? Looking at tokio, it seems to use a different threat for each test, so the thread_local (or lazy_static with thread_name) approach could work. Might break with #[tokio::test(flavor = "multi_thread")] though if that runs tests on different threads. I guess what you actually want is a task identifier but getting that is likely framework specific to tokio.

la10736 commented 1 year ago

Oh I see, the argument point makes sense. How did you solve this in today's rstest? Do you replace #[tokio::test] with your own logic that allows parameters?

No: For each input combination I generate a new #[tokio::test] function that resolve the inputs and call the original user's test function.

The multithreaded case cannot be an issue: every #[*::test] attribute will generate a standard #[test] attribute that is implemented with a new thread for each test and we should resolve the inputs before start tokio's runtime. Now the lazy_static approach with a global mutexed hash map need that the parameters should be Send and so thy are not general enough ... for instance Future<_> are not Send.... maybe I should Pin them but I should study it before.

smessmer commented 1 year ago

Ah I see, you're not actually processing the #[tokio::test] macro, but you're outputting code that contains it. And you're modifying the user provided test function to add parameters to it.

Looks like your idea should work but it would introduce some overhead. If you're looking for more ideas, here are two that jumped to my mind:

Seems to me that one option would be to eagerly expand the #[tokio::test] macro and then modify it's output using your macro to add parameters instead of modifying the user provided test function. But we may have to wait for https://github.com/rust-lang/rust/issues/90765 to be able to do this.

Another option could be to keep doing what retest is doing today to keep general support of all test macros, but add a special case optimization for known-expensive ones like #[tokio::test]. If we have manual code dealing with one specific test macros, we know what it's supposed to expand to and can just expand it ourselves. Or maybe we can even depend on and just call into tokio's implementation of the macro to expand the token stream.

la10736 commented 1 year ago

Looks like your idea should work but it would introduce some overhead.

From the first (and very partial) benchmark seams very negligible.

Seems to me that one option would be to eagerly expand the #[tokio::test] macro and then modify it's output using your macro to add parameters instead of modifying the user provided test function. But we may have to wait for rust-lang/rust#90765 to be able to do this.

That was exactly what I'm looking for.... but seams stalled :cry:

Another option could be to keep doing what retest is doing today to keep general support of all test macros, but add a special case optimization for known-expensive ones like #[tokio::test]. If we have manual code dealing with one specific test macros, we know what it's supposed to expand to and can just expand it ourselves. Or maybe we can even depend on and just call into tokio's implementation of the macro to expand the token stream.

I thought about it but I'll be my last resource.

smessmer commented 1 year ago

I guess another option could be to fix #[tokio::test] so that it duplicates less boilerplate and instead moves its boilerplate into a shared function

la10736 commented 1 year ago

I did some test in https://github.com/la10736/test_in_test and I fond a good implementation with threadlocal and without any lock. In my cases (100000 very trivial tests) the speedup is 4x but I don't think it is really related to the number of lines introduced from the #[tokio::test] boilerplate but more from type inference that tokio test need to compile. If you take a look to the code in test_in_test repository you can see that I did also some example by move out the context struct instead of replicate it for each case: the difference is negligible. Also for test execution time there is no difference from the standard rstest implementation and new I one that I would implement.

I'll try to implement it in the next days in a branch (I hope tomorrow but I'm not sure). Then you can try it in your codebase.

smessmer commented 1 year ago

Sounds good. Happy to report back with measurements once you have something.

la10736 commented 1 year ago

I was wrong :cry: It's really hard try to encapsulate the state in a generic static object because size can change for every case. I should search something else that put the state in the heap or something with unsafe to recover it from the stack.

ozankabak commented 3 months ago

Any progress on this?

la10736 commented 3 months ago

No, I'm sorry. I've no time to work on it.

ozankabak commented 3 months ago

No worries -- maybe we can help. I'll take a look at the previous work and report back if I get any actionable ideas

la10736 / rstest

Very long build time #184