mrpowers-io / spark-fast-tests

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
https://mrpowers-io.github.io/spark-fast-tests/
MIT License

Can this library be used with Java ? #11

Open jdk2588 opened 7 years ago

jdk2588 commented 7 years ago

I see the examples are with Scala, can this library be used with Java ?

MrPowers commented 7 years ago

Thanks for the great question @jdk2588 😄

It looks like it's possible to run a Scala JAR file in Java, but I don't know because I've never used Java.

You can download the latest JAR file here if you'd like to give it a shot. Let me know what you find!

jdk2588 commented 7 years ago

The question should be "Is the library tested with Java?" Both languages run on the JVM, so it can be used.

MrPowers commented 7 years ago

Unfortunately, the library is not tested with Java 😢

I'd add the tests, but I unfortunately don't know Java. Sorry about that.

If you are able to test the methods with Java, let me know and I'll be happy to merge any code with master 😄

MrPowers commented 5 years ago

If anyone is using this library with Java, please let me know how it is going for you! Adding a help wanted tag!

gregbrowndev commented 4 years ago

@MrPowers I recently picked up your Testing Spark Applications book. Unfortunately, my company insists on using Java to write our Spark apps, so I will let you know if this library works out!

This ticket is a couple of years old. Do you know if anyone has had any success in Java?

almogtavor commented 3 years ago

Any news on this? Support for JUnit would be great.

aggubin commented 2 years ago

Hi,

better late than never :)

I just started using spark-fast-tests in a Java project at work

Steps:

  1. maven dependency

    <dependency>
        <groupId>com.github.mrpowers</groupId>
        <artifactId>spark-fast-tests_2.12</artifactId>
        <version>1.2.0</version>
        <scope>test</scope>
    </dependency>

  2. public class MySparkTest implements DatasetComparer

  3. write the test

  4. assertSmallDatasetEquality(actual, expected, false, false, true, 10); Since Java has no default parameters, you have to pass ignoreNullable, ignoreColumnNames, orderedComparison, and truncate explicitly. I've set them to the defaults (from Scala), except for truncate, as my Datasets are small indeed.

  5. I haven't used other asserts yet, might update when I have ...
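Because every call in step 4 has to repeat the same four flags, a Java project might collect them in one place. The following is a minimal sketch under stated assumptions, not part of spark-fast-tests: `SmallDatasetDefaults`, `FullComparison`, and `assertWithDefaults` are hypothetical names, and the interface is only a stand-in for the six-argument call so that the example runs without Spark on the classpath. The flag values are copied from step 4.

```java
public class SmallDatasetDefaults {

    /**
     * Hypothetical stand-in for the six-argument call from step 4.
     * In a real test, an implementation would forward to the library method.
     */
    @FunctionalInterface
    public interface FullComparison<D> {
        void compare(D actual, D expected, boolean ignoreNullable,
                     boolean ignoreColumnNames, boolean orderedComparison, int truncate);
    }

    // Values copied from the call in step 4 (false, false, true, 10).
    public static final boolean IGNORE_NULLABLE = false;
    public static final boolean IGNORE_COLUMN_NAMES = false;
    public static final boolean ORDERED_COMPARISON = true;
    public static final int TRUNCATE = 10;

    /** Two-argument entry point that fills in the flags Java cannot default. */
    public static <D> void assertWithDefaults(FullComparison<D> comparison, D actual, D expected) {
        comparison.compare(actual, expected,
                IGNORE_NULLABLE, IGNORE_COLUMN_NAMES, ORDERED_COMPARISON, TRUNCATE);
    }

    public static void main(String[] args) {
        // A printing stand-in; strings take the place of Spark Datasets here.
        FullComparison<String> printing = (actual, expected, nullable, names, ordered, truncate) ->
                System.out.println(actual + " vs " + expected + ": "
                        + nullable + ", " + names + ", " + ordered + ", " + truncate);
        assertWithDefaults(printing, "actualDs", "expectedDs");
    }
}
```

In a real test class, the lambda would forward its six arguments to the `assertSmallDatasetEquality` call shown in step 4, so individual tests only pass `actual` and `expected`.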

Cheers, Alexander

almogtavor commented 2 years ago

@aggubin can you add the code sample?

aggubin commented 2 years ago

@almogtavor see (2) and (4) above; those are your code samples. For (3), create your test Dataset from CSV, TXT, or a String, and compare it to the actual Dataset from your method under test.

MrPowers commented 2 years ago

@aggubin - this is great! Thank you!!!

Any chance you can send me a PR with README instructions for Java users? Adding a little example to the JavaSpark example project would be awesome too. There are a lot of users that would appreciate this info!

aggubin commented 2 years ago

Hi Matthew,

not sure what you mean by PR, but here is a .zip with sample code

two questions for you:

  1. why do your asserts have "actual" first and "expected" second, whereas JUnit asserts take "expected" then "actual"?

  2. "assertSmallDatasetEquality()" shows only the first two columns when a test fails. Is there a way to print all columns, like DF.show(false)?

Cheers, Alexander

MrPowers commented 2 years ago

@aggubin - here are responses:

not sure what you mean by PR, but here is a .zip with sample code

A PR is a "pull request"

why do your asserts have "actual" first and "expected" second, whereas JUnit asserts take "expected" then "actual"?

Some test frameworks have actual first then expected second. The junit syntax wasn't considered when building this library.
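For Java teams used to JUnit's expected-first convention, one option is a thin adapter that swaps the arguments before delegating. This is a minimal sketch, assuming a hypothetical `expectedFirst` helper that is not part of spark-fast-tests; the `BiConsumer` stands in for any assertion that takes (actual, expected).

```java
import java.util.function.BiConsumer;

public class ExpectedFirst {

    /**
     * Wraps an (actual, expected) assertion so that callers can pass
     * (expected, actual), the order JUnit assertions use.
     */
    public static <D> BiConsumer<D, D> expectedFirst(BiConsumer<D, D> actualFirst) {
        return (expected, actual) -> actualFirst.accept(actual, expected);
    }

    public static void main(String[] args) {
        // Stand-in for an actual-first assertion; strings replace Datasets here.
        BiConsumer<String, String> actualFirst = (actual, expected) ->
                System.out.println("actual=" + actual + ", expected=" + expected);

        // Called in JUnit order: expected first, then actual.
        expectedFirst(actualFirst).accept("expectedDs", "actualDs");
    }
}
```

Whether such a wrapper is worth it is a judgment call: it keeps test code consistent with JUnit, at the cost of hiding the library's own parameter order.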

  2. "assertSmallDatasetEquality()" shows only the first two columns when a test fails. Is there a way to print all columns, like DF.show(false)?

Feel free to open up a separate issue to discuss the output of assertSmallDatasetEquality in more detail. For purposes of this discussion, we're focusing on adding documentation for Java users. Changing the output of the lib would be a separate conversation.

Thanks for the questions.

aggubin commented 2 years ago

I figured out that the "small" in "assertSmallDatasetEquality" really means a "narrow" dataset, with 1-2 columns.

I'm now using "assertApproximateDataFrameEquality" to see row diffs:

assertApproximateDataFrameEquality(actual, expected, 1.0, false, false, false);

That produces an OK-looking DataFrame that can be inspected.
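The third argument in that call, 1.0, appears to be the precision. As a rough illustration of what a precision-based comparison means, here is a minimal sketch that assumes the precision is an absolute tolerance applied to each numeric value. The thread does not confirm the library's exact rule, and `ApproxEquality` and `approxEqual` are hypothetical names.

```java
public class ApproxEquality {

    /**
     * Treats two values as equal when they differ by at most the given precision.
     * Assumption: an absolute (not relative) tolerance.
     */
    public static boolean approxEqual(double actual, double expected, double precision) {
        return Math.abs(actual - expected) <= precision;
    }

    public static void main(String[] args) {
        // With a precision of 1.0, a difference of about 0.9 passes.
        System.out.println(approxEqual(10.0, 10.9, 1.0));
        // A difference of 1.5 does not.
        System.out.println(approxEqual(10.0, 11.5, 1.0));
    }
}
```

Under this assumption, a precision of 1.0 is quite loose, so a tighter value may be preferable when numeric columns matter to the test.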

aggubin commented 2 years ago

my responses inline:

@aggubin - here are responses:

not sure what you mean by PR, but here is a .zip with sample code

A PR is a "pull request"

I probably won't make a pull request just for Readme.md. Feel free to use my original reply as the Readme, since it has all the steps, and to use the code I've sent for your Java subproject.

why do your asserts have "actual" first and "expected" second, whereas JUnit asserts take "expected" then "actual"?

Some test frameworks have actual first then expected second. The junit syntax wasn't considered when building this library.

ok

  2. "assertSmallDatasetEquality()" shows only the first two columns when a test fails. Is there a way to print all columns, like DF.show(false)?

Feel free to open up a separate issue to discuss the output of assertSmallDatasetEquality in more detail.

ok

For purposes of this discussion, we're focusing on adding documentation for Java users. Changing the output of the lib would be a separate conversation.

Thanks for the questions.
