discussion: Refactor our integration tests to make it more scalable.

liurenjie1024 commented 1 week ago

Background

Currenlty in our integration tests, for each test(each function), we will use docker compose to start a set of containers and stop them after tests done, see this line for an example. The advantage of this approach is that it provides perfect isolation for each test. But #420 shows that this may not be scalable on systems like macos.

Proposal

Refactor our tests to what's in pyiceberg, e.g. we use scripts to start containers outside tests, tests only cares about actual logic.

The pros of this approach is that it scales better and supposed to run faster.

The cons of this approach is that developers need to be careful with namespace issues for each test, also simple cargo test no longer works well.

Alternatives

There are some approaches to limit the concurrency of tests, for example RUST_TEST_THREADS, --tests-threads, etc. But they make the tests slower to execute.

liurenjie1024 commented 1 week ago

cc @Fokko @sdd @Xuanwo @ZENOTME @marvinlanhenke

Xuanwo commented 1 week ago

The cons of this approach is that developers need to be careful with namespace issues for each test, also simple cargo test no longer works well.

I feel like we can implement our test harness by using libtest_mimic so cargo test can keep working.

Here is an example from opendal:

https://github.com/apache/opendal/blob/main/core/tests/behavior/main.rs#L54

fn main() -> anyhow::Result<()> {
    let args = Arguments::from_args();

    let mut tests = Vec::new();

    async_read::tests(&op, &mut tests);

    ...

    let conclusion = libtest_mimic::run(&args, tests);

    conclusion.exit()
}

liurenjie1024 commented 1 week ago

The cons of this approach is that developers need to be careful with namespace issues for each test, also simple cargo test no longer works well.

I feel like we can implement our test harness by using libtest_mimic so cargo test can keep working.

Here is an example from opendal:

https://github.com/apache/opendal/blob/main/core/tests/behavior/main.rs#L54
fn main() -> anyhow::Result<()> {
    let args = Arguments::from_args();

    let mut tests = Vec::new();

    async_read::tests(&op, &mut tests);

    ...

    let conclusion = libtest_mimic::run(&args, tests);

    conclusion.exit()
}

Sounds interesting to me. The only disadvantage of this approach I can see is that user could no longer run single test in ide, which may make developer more difficult to debug, but others are good to me.

Xuanwo commented 1 week ago

The only disadvantage of this approach I can see is that user could no longer run single test in ide,

IDE can still run tests but need some configuration. The configuration is a bit complex that I prefer to encourage deveopers to use cargo test <test_name> instead directly. This does hurt our develop experience.

thexiay commented 1 week ago

maybe custom_test_frameworks helps a lot.

liurenjie1024 commented 6 days ago

maybe custom_test_frameworks helps a lot.

It requires unstable rust, we may try it in future when it's in stable.

thexiay commented 6 days ago

Maybe there is a way to balance the development experience and improve the performance of preparing container environment.

E.g. testcontainers, factory.rs, you can init docker compose once and reuse it in multiple test case.

use OnceLock<Mutex<Weak<DockerCompose>>> to slove problem, re-initialize it or reuse it when you get it, and return Arc<DockerCompose>, when all reference of DockerCompose is released, the Drop of DockerCompose will be called automatic.

@liurenjie1024

liurenjie1024 commented 6 days ago

Maybe there is a way to balance the development experience and improve the performance of preparing container environment.

E.g. testcontainers, factory.rs, you can init docker compose once and reuse it in multiple test case.

use OnceLock<Mutex<Weak<DockerCompose>>> to slove problem, re-initialize it or reuse it when you get it, and return Arc<DockerCompose>, when all reference of DockerCompose is released, the Drop of DockerCompose will be called automatic.

@liurenjie1024

Thanks @thexiay . The problem with this approach is that, it's only limited to one process. Rust treats each integration test file as a separate crate, e.g. each integration test file will be executed in one process, and will start a set of containers. This may not be a limitation for now, since most of our integration tests only have one file. But in future if we have more complex test, we may still need to refactor to adopt @Xuanwo 's solution.

The advantage of this approach is that, it still have good developer experience.

cc @Xuanwo @Fokko What do you think?

ZENOTME commented 6 days ago

Thanks @thexiay . The solution looks good to me.

This may not be a limitation for now, since most of our integration tests only have one file. But in future if we have more complex test, we may still need to refactor to adopt @Xuanwo 's solution.

I think it's still possible to share a var within multiple test files for the integration test within one crate.

liurenjie1024 commented 6 days ago

I think it's still possible to share a var within multiple test files for the integration test within one crate.

If I understand correctly, it's not possible because each integration test file will be compiled into a standalone binary, e.g. they have different global variables.

Personally I'm in favor @thexiay 's solution since currently it doesn't limit us much. We may adopt @Xuanwo 's solution when necessary.

Xuanwo commented 6 days ago

Personally I'm in favor @thexiay 's solution since currently it doesn't limit us much. We may adopt @Xuanwo 's solution when necessary.

Well, I'm not a fan of running integration tests in IDE, so I don't care this part very much :laughing:

I'm willing to help migrate our our test harness if decided.

liurenjie1024 commented 6 days ago

It seems that currently all our crates's integration tests have only one file, so how about we get started with @thexiay 's simpler approach, and do the migration when necessary? cc @Xuanwo What do you think?

thexiay commented 6 days ago

It seems that currently all our crates's integration tests have only one file, so how about we get started with @thexiay 's simpler approach, and do the migration when necessary? cc @Xuanwo What do you think?

If you guys certain it, i can finish this approach and integrate it into test_utils.

Xuanwo commented 6 days ago

It seems that currently all our crates's integration tests have only one file, so how about we get started with @thexiay 's simpler approach, and do the migration when necessary? cc @Xuanwo What do you think?

That's fine with me. We can always improve this part as needed. So please feel free to dive in, @thexiay.

liurenjie1024 commented 5 days ago

Cool, let move!

apache / iceberg-rust