apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0
6k stars 2.1k forks source link

Move JUnit4 tests to JUnit5 #7160

Open nastra opened 1 year ago

nastra commented 1 year ago

Feature Request / Improvement

In our contributing guidelines we point people to writing new unit tests using JUni5.

We should consider/evaluate what the best approach would be to eventually move existing tests from JUnit4 tests to JUnit5.

Query engine

None

mitrofmep commented 1 year ago

Hello! I'd like to help with this issue. Just checked #209. Am I right, that it needs to check all the test classes, where only "import org.junit.*" used, without import jupiter api? For example, like in core/src/test/java/org/apache/iceberg/avro/AvroDataTest.java. And if possible, to refactor them using junit5?

nastra commented 1 year ago

It's worth looking at https://junit.org/junit5/docs/current/user-guide/#migrating-from-junit4. There are a few items to watch out for:

For example in JUnit4 the method signature of assertEquals() was assertEquals(String message, Object expected, Object actual) but in JUnit5 it is assertEquals(Object expected, Object actual, String message). There are also a few more methods affected by this. See https://junit.org/junit5/docs/current/user-guide/#migrating-from-junit4-failure-message-arguments for some details. In those cases I'm not sure what's better (just converting from JUnit4 to JUnit5 assertions or switching assertion checks to AssertJ where it makes sense). Both will cause larger diffs.

For modules that don't use JUnit5 yet, those would require

test {
    useJUnitPlatform()
  }

Things like @Rule public TemporaryFolder temp = new TemporaryFolder(); don't work in JUni5 and would have to be converted to @TempDir Path temp;.

It's worth doing the conversion module-by-module and starting in a module that has a small amount of tests (like iceberg-pig or iceberg-gcp).

@jackye1995 any thoughts on the outlined approach?

jackye1995 commented 1 year ago

Yes we definitely want to convert module by module.

I do not have strong opinion regarding using Junit assertEquals vs AssertJ, it seems like they will incur almost the same amount of changes:

Assertions.assertThat("a").isEqualTo("b").describedAs("xxx");

vs

Assertions.assertEquals("a", "b", "xxx")

AssertJ might have the advantage that it can format the description with .describedAs("xxx %s %s", x, y) so it's a bit easier to use. And it might simplify some test cases. I don't quite get why we want to lazily load a description message with the feature that Junit5 provides. So maybe using AssertJ could be preferred.

nastra commented 1 year ago

I think in a lot of cases we don't necessarily need .describedAs(..) / .as(..) as the context that is being provided when a check fails is usually enough. Taking the below example:

Assert.assertTrue("OSS file should exist", ossClient().get().doesObjectExist(uri.bucket(), uri.key()));
Assert.assertEquals("Should have expected location", location, out.location());
Assert.assertEquals("Should have expected length", dataSize, ossDataLength(uri));
Assert.assertArrayEquals("Should have expected content", data, ossDataContent(uri, dataSize));

which could be rewritten to:

Assertions.assertThat(ossClient().get().doesObjectExist(uri.bucket(), uri.key())).as("OSS file should exist").isTrue();
Assertions.assertThat(out.location()).isEqualTo(location);
Assertions.assertThat(ossDataLength(uri)).isEqualTo(dataSize);
Assertions.assertThat(ossDataContent(uri, dataSize)).isEqualTo(data);

That being said, we should just double-check when transitioning/reviewing where it makes sense to have a failure description / overriding error msg.

It's also worth mentioning that there are scripts available to convert assertions: https://assertj.github.io/doc/#assertj-migration

jackye1995 commented 1 year ago

I just noticed one thing, given the fact that Iceberg prefers to not import static methods directly, we will end up having 2 Assertions classes if we use both Junit5 and AssertJ. Given that we are already using AssertJ, we probably need to stick with that instead of mixing both.

This means we want to migrate to Junit5 for framework and usages of things like tempdir, before, after, etc. but use AssertJ for all assertions. Should we have a thread on devlist just to confirm this as a community guideline?

there are scripts available to convert assertions

That's super cool, maybe we should try it.

reswqa commented 1 year ago

Hi guys,

I am very happy to see this discussion on migrating test framework to Junit5, which will bring great benefits to the testing part of the project. For example, stronger support for parameterized test and better support for lambda expressions.

Since the Apache Flink community has already started migrating tests to Junit5 and Assertj a long time ago, maybe we can learn something from it. In short, It adopted a module-by-module migration solution, and you can refer to this umbrella ticket.

It's worth looking at https://junit.org/junit5/docs/current/user-guide/#migrating-from-junit4.

It is really a good reference to start. At the same time, I would like to share the [migration guidance](JUnit5 migration guide) of the flink community for your reference.

It's also worth mentioning that there are scripts available to convert assertions: https://assertj.github.io/doc/#assertj-migration

This script does most of the boring migration stuff. However, according to my experience, there are still some special cases that it cannot handle, such as converting try{} catch(expected exception) to Assertions.assertThatThrowBy. However, there are not many such exceptions. We can run this script first and spend a little more time to manually handle some special cases.

Things like @Rule public TemporaryFolder temp = new TemporaryFolder(); don't work in JUni5 and would have to be converted to @TempDir Path temp;.

Yes, something like this does the more tedious part of the migration process.

I also want to share another similar case:

Juint5 has much better support for parameterized tests to avoid separating parameterized and non-parameterized parts into different test classes. But this also brings some troubles to our migration, because many of our previous tests were parameterized for the whole class, but now we have to inject parameters for each method. In order to avoid the inconvenience caused by the above problems as much as possible, we have done some work in flink:

For more details, please refer to this ticket.

Of course, I am also very willing to participate in the migration of iceberg's test framework, and I look forward to taking this opportunity to integrate into this great community.

jackye1995 commented 1 year ago

Thank you Weijie for the detailed explanation, this is super helpful! Regarding Junit5 vs AssertJ Assertions, what did Flink community do? Do you think using AssertJ Assertions (which seems to be the consensus above) is preferred over Junit5?

reswqa commented 1 year ago

Thanks @jackye1995 for the quick reply!

Regarding Junit5 vs AssertJ Assertions, what did Flink community do?

Earlier, many different types of assertions were mixed in Flink's codebase, such as Junit4 assertion, Hamcrest,some custom assertion, etc.

While migrating to Junit5, Flink community decided that use AssertJ as the only assertion framework.

Do you think using AssertJ Assertions (which seems to be the consensus above) is preferred over Junit5?

In my personal experience, AssertJ is really powerful and the benefit speaks for itself.

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

attilakreiner commented 4 weeks ago

Hey Team, I am new here, I was looking for a beginner task and I've seen you are in the process of migrating from Junit 4 to 5.

I just gave it a try and migrated over the tests in the :iceberg-data module. (The :iceberg-flink module also got affected; and a little bit :iceberg-spark and :iceberg-mr too). I believe I have completely removed Junit 4 from the :iceberg-data module.

Pls review this PR: https://github.com/apache/iceberg/pull/10657

I'd highly appreciate any feedback.