ERDDAP / erddap

ERDDAP is a scientific data server that gives users a simple, consistent way to download subsets of gridded and tabular scientific datasets in common file formats and make graphs and maps. ERDDAP is a Free and Open Source (Apache and Apache-like) Java Servlet from NOAA NMFS SWFSC Environmental Research Division (ERD).
Creative Commons Zero v1.0 Universal
76 stars 54 forks source link

Migrate tests to JUnit #142

Closed ChrisJohnNOAA closed 2 months ago

ChrisJohnNOAA commented 3 months ago

Description

This migrates tests to JUnit along with updates to file paths to make them runnable with minimal setup for a new developer.

This also loads test resources through maven by downloading from a github release rather than including the files in the repo. There were some files included previously those have been removed. The reasoning is a lot of the test data files are >10 mb and in total the test data is several gb of data. While downloading the files aren't great, I think its a better solution than including them in the main repo.

About 400 tests will now run with mvn test. On my computer(s) the first run takes a while, but once things are cached appropriately, future runs are around 30 minutes.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

Checklist before requesting a review

srstsavage commented 2 months ago

Is any setup besides getting netcdfAll-5.5.3.jar needed before running mvn test? I get the following:

[ERROR] Tests run: 384, Failures: 0, Errors: 211, Skipped: 0 

Haven't dug in at all yet but many of the errors seem to be around tests taking a bit longer than the specific threshold, or the erddapContentDirectory not being properly set up, or failure to initialize various classes in this context (gov.noaa.pfel.erddap.util.EDStatic, gov.noaa.pfel.erddap.dataset.EDDTableFromFiles, etc). Or maybe it's expected that many of the tests will fail at this point?

ChrisJohnNOAA commented 2 months ago

Is any setup besides getting netcdfAll-5.5.3.jar needed before running mvn test? I get the following:

[ERROR] Tests run: 384, Failures: 0, Errors: 211, Skipped: 0 

Haven't dug in at all yet but many of the errors seem to be around tests taking a bit longer than the specific threshold, or the erddapContentDirectory not being properly set up, or failure to initialize various classes in this context (gov.noaa.pfel.erddap.util.EDStatic, gov.noaa.pfel.erddap.dataset.EDDTableFromFiles, etc). Or maybe it's expected that many of the tests will fail at this point?

All of the tests that are currently enabled are passing on my machine. There are several hundred more tests that are disabled with various Tags.

I tried to have maven handle getting everything (including netcdf) so there should be minimal setup.

My guess with that number of errors is the src/test/resources directory didn't get setup properly. Maven should be downloading several large zip files from here: https://github.com/ERDDAP/erddapTest/releases/tag/rc-test and then unzipping them to src/test/resources. The download and unzipping are specified in the pom. The contents of the src/test/resources should be copied by maven (default behavior so not in the pom) to target/test-classes. So for example there should be target/test-classes/data, target/test-classes/largeFiles, target/test-classes/largePoints, target/test-classes/largeSatellite along with all of the directories from compiling the code/tests.

The other possibility (because you mentioned errors about erddapContentDirectory ) is the download of the content.zip (and unzipping of it) might have failed.

If it's neither of those, the earliest error about failing to initialize a class is generally the most helpful. So could you share that?

What OS are you on? Did any maven steps fail before running the tests?

srstsavage commented 2 months ago

OS is Linux (Debian 12), Maven 3.6.3, Java 17.0.3. It looks like some of the main problems are due to usage of the Windows specific backslash \ file separators in two places. I opened a PR for that here: https://github.com/ChrisJohnNOAA/erddap/pull/5. If that causes issues on Windows we'll want to use Java's File.separator but I think forward slashes might be handed fine by Java even on Windows.

The other issue I'm seeing is that the values in the default setup.xml aren't usable in the test context. For example, in ./content/erddap/setup.xml by default bigParentDirectory is set to /home/yourName/erddap/, which causes an attempt to create directories under /home/yourName/erddap/ which fails because that directory doesn't exist. Similar things happen with failed connection attempts to your.smtp.host.edu etc.

In the draft/example PR #116 I showed one example of how we've dealt with the setup.xml values in the past (by creating a second copy and updating the example values with values usable in a testing context, example https://github.com/ERDDAP/erddap/pull/116/files#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8R324). Is the stock setup.xml working for you locally on Windows, or are you pointing to a curated setup.xml, or some other approach?

ChrisJohnNOAA commented 2 months ago

I'm running the tests now with your change to the file path. I'll see how it does. I think there were other uses of backslash than the places you identified, so we may need additional changes for that on linux, but it will be hard for me to find those.

As for setup, those values weren't causing test to break for me, but I definitely agree they're not great. As part of the Jetty changes, I've discussed with Micah adding a local development version of setup.xml that we can use for tests and Jetty. I'd prefer to go that approach rather than editing the xml in the pom file.

There's still a lot of spam in the logs with these tests to get cleaned up. As far as I can tell the connection errors for your.smtp.host.edu is just noise and doesn't cause any test failures. If you're seeing tests fail from it, let me know which ones. That said I don't think we want to actually send emails in the tests. I've considered disabling the email thread in testing (https://github.com/ChrisJohnNOAA/erddap/blob/test_migrations/WEB-INF/classes/gov/noaa/pfel/erddap/util/EDStatic.java#L1873 ) which I believe stops the error noise in the logs. Thoughts?

srstsavage commented 2 months ago

As for setup, those values weren't causing test to break for me, but I definitely agree they're not great. As part of the Jetty changes, I've discussed with Micah adding a local development version of setup.xml that we can use for tests and Jetty. I'd prefer to go that approach rather than editing the xml in the pom file.

That sounds good to me, we'll just need to make sure that the development setup.xml stays in line with the example version if new parameters are added etc. That's one benefit of doing dynamic value replacement on the example setup.xml, but it's also more complex/brittle.

There's still a lot of spam in the logs with these tests to get cleaned up. As far as I can tell the connection errors for your.smtp.host.edu is just noise and doesn't cause any test failures. If you're seeing tests fail from it, let me know which ones. That said I don't think we want to actually send emails in the tests. I've considered disabling the email thread in testing (https://github.com/ChrisJohnNOAA/erddap/blob/test_migrations/WEB-INF/classes/gov/noaa/pfel/erddap/util/EDStatic.java#L1873 ) which I believe stops the error noise in the logs. Thoughts?

Ah, that makes sense, I just saw the smtp messages and assumed they were more than just warnings. Current test log output is about 312M so there's a lot of content! :+1: to your suggestion to disable email sending during testing.

With the file separator fixes I'm down to 70 test failures. Looking through them now. Many appear to be string hashing differences between Windows and Linux, likely due to line endings as noted in #140, so hopefully there's a simple fix for that.

Many other tests are failing due to performance/execution time thresholds. Any thoughts on how to handle those? I think we'd still want to run the tests measuring performance but only fail on exceeded threshold if a particular config was set, maybe something as simple as a provided system property (mvn test -DcheckPerformance=true or mvn test -DtestPerformanceMultiplier=2).

At least one test is failing due to case sensitivity in a test filename. I expect we'll probably see more.

-------------------------------------------------------------------------------
Test set: gov.noaa.pfel.coastwatch.griddata.GridTests
-------------------------------------------------------------------------------
Tests run: 14, Failures: 0, Errors: 9, Skipped: 0, Time elapsed: 2.022 s <<< FAILURE! -- in gov.noaa.pfel.coastwatch.griddata.GridTests
gov.noaa.pfel.coastwatch.griddata.GridTests.testHDF -- Time elapsed: 0.288 s <<< ERROR!
java.io.FileNotFoundException: /src/erddap/target/test-classes/data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_Y50.grd (No such file or directory)
$ egrep -R "OQNux10S1day_20050712_x-135_X-105_y22_Y50.grd" src/test/java
src/test/java/gov/noaa/pfel/coastwatch/sgt/SgtMapTests.java:    gridGrid.readGrd(griddataDir + "OQNux10S1day_20050712_x-135_X-105_y22_Y50.grd",
src/test/java/gov/noaa/pfel/coastwatch/sgt/SgtMapTests.java:    contourGrid.readGrd(griddataDir + "OQNux10S1day_20050712_x-135_X-105_y22_Y50.grd",
$ unzip -l download_cache/data.zip_6208d3172b23be83ed2ca47f22cbf49c | egrep -i "OQNux10S1day_20050712_*"
    95865  2023-05-26 14:13   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_Y50.asc
    49551  2023-05-26 14:13   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_y50.gif
    55320  2023-05-26 14:13   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_y50.grd
   118599  2023-05-26 14:13   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_y50.hdf
    62144  2024-02-14 15:34   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_Y50.nc
   258033  2023-05-26 14:13   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_y50.xyz
$ unzip -l download_cache/data.zip_6208d3172b23be83ed2ca47f22cbf49c | egrep -i "OQNux10S1day_20050712_x-135_X-105_y22_Y50.grd"
    55320  2023-05-26 14:13   data/gridTests/OQNux10S1day_20050712_x-135_X-105_y22_y50.grd

So the test is looking for OQNux10S1day_20050712_x-135_X-105_y22_Y50.grd, but the file is actually named OQNux10S1day_20050712_x-135_X-105_y22_y50.grd (last y is lowercase). How do you want to handle these cases? If we want to update the filename in the test I can send PRs to this branch. If we want to update the filename of the test file we'd need to do so in the release downloads on https://github.com/ERDDAP/erddapTest/, but I'm not seeing the OQNux10S1day_20050712* files in that repo (could be missing them!).

ChrisJohnNOAA commented 2 months ago

For the performance thresholds, I'm not sure. In many cases the performance check is the only test/assert in the code. Perhaps the best course of action is to tag performance tests and setup a mvn command that will test them. The values are tuned to Bob's machine so they likely need some other changes to make them currently useful (other than verifying we run through the performance tests without throwing exceptions).

For the hashing/line ending errors, could you send me a few examples so I can investigate a fix that will work for them all?

For the capitalization differences, the easiest thing is likely to update the filename in the test. The test resources that are being downloaded are not a packaging of the erddapTest repo. Some of the files are from there, but there were many tests relying on data not in that repo. The test resources getting downloaded are a new packaging of the data needed for tests to try to get as many of the existing tests working as possible.

srstsavage commented 2 months ago

For the hash mismatches in HashDigestTests.basicTest, the culprit is actually an absolute path, not line endings.

https://github.com/ERDDAP/erddap/blob/main/src/test/java/com/cohort/util/HashDigestTests.java#L10

String tName = HashDigestTests.class.getResource("LICENSE.txt").getPath();

tName is set to an absolute path (/home/shane/src/erddap/target/classes/com/cohort/util/LICENSE.txt on my machine) which of course isn't portable. Seems like we should either make that path relative to the project root or not use a path at all. Thoughts?

ChrisJohnNOAA commented 2 months ago

For the hash mismatches in HashDigestTests.basicTest, the culprit is actually an absolute path, not line endings.

https://github.com/ERDDAP/erddap/blob/main/src/test/java/com/cohort/util/HashDigestTests.java#L10

String tName = HashDigestTests.class.getResource("LICENSE.txt").getPath();

tName is set to an absolute path (/home/shane/src/erddap/target/classes/com/cohort/util/LICENSE.txt on my machine) which of course isn't portable. Seems like we should either make that path relative to the project root or not use a path at all. Thoughts?

I think that's actually hashing the file contents of the file at that path. The path comes from HashDigestTests.class.getResource("LICENSE.txt").getPath(); which turns a path relative to the resource directory into an absolute path.

It does also look like the resource path didn't get updated when I moved it to resources/data/LICENSE.txt I am shocked that didn't cause an error for me.

srstsavage commented 2 months ago

Ah! Ok, that makes sense. The previous HashDigestTests.class.getResource("LICENSE.txt").getPath(); was finding target/classes/com/cohort/util/LICENSE.txt (aka WEB-INF/classes/com/cohort/util/LICENSE.txt) for me, VS HashDigestTests.class.getResource("/data/LICENSE.txt").getPath(); which resolves to target/test-classes/data/LICENSE.txt (aka src/test/resources/data/LICENSE.txt). The only difference between those two files is...line endings.

$ diff --strip-trailing-cr WEB-INF/classes/com/cohort/util/LICENSE.txt src/test/resources/data/LICENSE.txt
# empty output/no difference
$ diff WEB-INF/classes/com/cohort/util/LICENSE.txt src/test/resources/data/LICENSE.txt
# every line differs due to line endings

Ok great, I'll work on a PR fixing the case sensitivity issues and we'll see where we are...

srstsavage commented 2 months ago

Well, it looks like the case of the test file OQNux10S1day_20050712_x-135_X-105_y22_y50.grd is meaningfully wrong as the second _Y is supposed to indicate the max value of the y range, so it seems like the test file should be changed in the test data download.

https://github.com/ChrisJohnNOAA/erddap/blob/main/WEB-INF/classes/gov/noaa/pfel/coastwatch/griddata/FileNameUtility.java#L732-L739

    /**
     * This returns minY (e.g., 22).
     *
     * @param fileName a base (assumes full region) or custom file name.
     * @return minY.
     */
    public double getMinY(String fileName) {
        int poy = fileName.indexOf("_y");
        if (poy < 0) 
            return regionMinY;

        int poY = fileName.indexOf("_Y", poy + 2);
        return String2.parseDouble(fileName.substring(poy + 2, poY));
    }

Simply updating the testName path in GridTests results in a string index out of bounds error because the _Y can't be found in the filename by FileNameUtility.

srstsavage commented 2 months ago

Spent quite a bit of time over the weekend reviewing test failures on Linux. A PR with fixes for many of them are here:

https://github.com/ChrisJohnNOAA/erddap/pull/7

Most of the remaining failing tests (~45 in total) fall into two categories:

Performance tests running slower than expected

This has been discussed, but the performance specifics in the test are non-portable and should be able to be disabled and/or adjusted during testing. The tests should be runnable on a minimal CI virtual machine, etc.

com.cohort.util.TestUtil.testString2canonicalStringHolder
com.cohort.util.TestUtil.testString2
gov.noaa.pfel.coastwatch.pointdata.TableTests.testBigAscii
gov.noaa.pfel.erddap.dataset.EDDTableFromFileNamesTests.testgoes17all

Differences in generated images due to font rendering

Haven't dug much into this one yet. It's possible I don't have my Linux environment set up completely correctly with regard to the desired fonts, but I believe I do. Also definitely possible that text rending in images is just a little different between Windows and Linux.

gov.noaa.pfel.coastwatch.sgt.SgtGraphTests.testSurface_ff
gov.noaa.pfel.coastwatch.sgt.SgtGraphTests.testSurface_ft
gov.noaa.pfel.coastwatch.sgt.SgtGraphTests.testSurface_tt
gov.noaa.pfel.coastwatch.sgt.SgtGraphTests.testDiverseGraphs_tff
gov.noaa.pfel.coastwatch.sgt.SgtGraphTests.testDiverseGraphs_tft
gov.noaa.pfel.coastwatch.sgt.SgtGraphTests.testDiverseGraphs_ttt
gov.noaa.pfel.coastwatch.sgt.SgtMapTests.testOceanPalette
gov.noaa.pfel.coastwatch.sgt.SgtMapTests.basicTest
gov.noaa.pfel.coastwatch.sgt.SgtMapTests.testTopography
gov.noaa.pfel.coastwatch.sgt.SgtMapTests.testBathymetry
gov.noaa.pfel.coastwatch.sgt.SgtMapTests.testX180To180Regions
gov.noaa.pfel.coastwatch.sgt.SgtMapTests.testX0To360Regions
gov.noaa.pfel.erddap.dataset.EDDGridFromDapTests.testMapAntialiasing
gov.noaa.pfel.erddap.dataset.EDDGridFromEtopoTests.testBasic
gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesTests.testUInt16File
gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesTests.testSimpleTestNc
gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesUnpackedTests.testUInt16File
gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesUnpackedTests.testMissingValue
gov.noaa.pfel.erddap.dataset.EDDGridLon0360Tests.testPM180
gov.noaa.pfel.erddap.dataset.EDDGridLon0360Tests.testInsert
gov.noaa.pfel.erddap.dataset.EDDGridLonPM180Tests.test120to320
gov.noaa.pfel.erddap.dataset.EDDTableFromAsciiFilesTests.testFixedValueAndScripts
gov.noaa.pfel.erddap.dataset.EDDTableFromNcFilesTests.testTimeAxis
gov.noaa.pfel.erddap.dataset.EDDTableFromNcFilesTests.testManyYears

The good news is that all of the image checking is done in a single utility method com.cohort.util.Image2Tests.testImagesIdentical, so it would be very easy to disable or adjust parameters in one place.

If we add an option to disable, I think it should only disable the image comparison part and still run the image generation code, as that's still very valuable to ensure working.

One example set of expected/observed/difference images:

Expected: SgtGraphTestSurfaceY_exp

Observed: SgtGraphTestSurfaceY_obs

Difference: SgtGraphTestSurfaceY_diff

Other than that, there are ~10 other test failures which seem due to various differences, I'll try to chase those down later, but I probably won't be able to give them much attention next week.

Not sure what the goal is for Linux compatibility on these tests at this point. I don't think it needs to block merging, we can definitely fix tests in separate PRs later.

Oh and one other note that's already been discussed but I want to emphasize: the default values in content/erddap/setup.xml requiring /home/yourName and /data/httpGet to exist and be writing are pretty painful. The sooner those can be configurable and/or local to the project directory or /tmp the better.

ChrisJohnNOAA commented 2 months ago

I believe there's around 40 image comparison tests currently so it's quite possible that's the vast majority of the differences left.

For anybody making changes that might impact the generated images, I'd like them to be able to run the tests with a before/after comparison. I actually set up the code to automatically create the expected files from the observed (if the expected files are missing).

https://github.com/ChrisJohnNOAA/erddap/blob/test_migrations/src/test/java/com/cohort/util/Image2Tests.java#L246-L254

We could have the expected images be a separate optional pack and just have the expected images be locally generated with instructions to generate the images before making changes to confirm what impact changes have. If you want to test this locally, delete src/test/resources/data/images and target/test-classes/data/images.

I do want the tests to be runnable on Linux, both to lower barriers to others contributing, but also because many people run ERDDAP on Linux servers, so it is good to be confident in that environment as well. I don't have a linux machine available to me currently though so it's something I'll need assistance with.

As for the setup.xml issues. Micah's pull https://github.com/ERDDAP/erddap/pull/147 should help with that. It provides a setup.xml intended to work for local development/tests. I'll make sure it works well with all the tests once it is merged.

ChrisJohnNOAA commented 2 months ago

@srstsavage I've uploaded a new version of data.zip. The normal process would be to have a new version of the data which the download/cache system should update automatically. For right now though you'll need to delete your download_cache/data.zip, src/test/resources/data, and target/test-classes/data files.

ERDDAP assumes linux machines have csh installed. If you do not you likely have a few errors due to that missing.

And I disabled a couple tests for now due to different behavior on Linux/Windows which I'll need to investigate more.

This gets all of the tests passing in my virtual linux machine (I talked to IT and got VirtualBox installed).

ChrisJohnNOAA commented 2 months ago

@srstsavage I marked this as ready for review. While debugging some of the differences I had between my Windows and Linux (VirtualBox) machines, I realized that because of how many of the tests are constructed there are quite possibly errors for others. It comes down to a lot of the tests being based on comparing the dataset's attributes to a hardcoded set of values. However the attributes for a dataset are in many cases based on which file has the first or last (depending on settings) file modification timestamp. On a windows machine a copied (or file extracted from a zip) will retain it's original modification timestamp (but have a creation timestamp when it was copied/extracted). On Linux (possibly exact distribution dependent) when Java asks for the last modified timestamp of a file, the (OS) provided value will not be earlier than the file creation. What this means is which file is used for the metadata (and therefore attributes the test is comparing against) can be effectively random because it's now based on the order the files were extracted from the zip.

It would likely be good to move away from tests that can break if files are extracted from the zip in a different order, but that is outside of the scope of this pull. I'm also open to changes to how dataset metadata is extracted from files if that's appropriate (but also outside the scope of this change).

So that's a long way of saying, I think this pull is good enough. If there are tests that are failing on a clean build on other machines, I'll fix them, but I don't currently know of any.

srstsavage commented 2 months ago

Ah thanks, that makes sense about the zip extraction. I'll give this another full test run and review tonight.

srstsavage commented 2 months ago

Added another small PR here to re-add maven-antrun-plugin execution to the pom that creates the bigParentDirectory data referenced in development/jetty/config/setup.xml and comment out a few more non-portable performance tests.

I still have ~10 failing tests but I believe they're mostly due to the indeterminate zip extraction issue you mentioned. I agree that the few remaining failures don't need to get sorted out here and can be addressed in more targeted PRs.