senbox-org / snap-gpt-tests

GPT tests that will be included in the SNAP testing platform.

Where to download test data? #32

Open milechin opened 1 year ago

milechin commented 1 year ago

I would like to use this GPT tests framework to test an install of gpt on our cluster, but I am unable to find the test data. At the bottom of the README file there is a reference to an S3 bucket that contains the data. The section indicates I need to go to the "Confluence page": https://senbox.atlassian.net/wiki/spaces/SENBOX/pages/2490433537/S3+bucket

But I do not have permission to view this page.

Where can I get a copy of this test data?

TomBlock commented 1 year ago

Dear Dennis,

these data are intended for internal testing only. We will discuss in the team whether we want to, and can, make the data publicly available. Setting up the GPT test suite is a rather complex task, and the dataset covers ~600GB, including data that is not free of charge to use.

You can safely assume that our team ran the complete test suite before shipping, so for your cluster it would probably be sufficient to run a simpler test with one or two nodes just to check the installation.

Cheers, Tom

milechin commented 1 year ago

Hello Tom,

Thank you for the prompt response. Do you have a recommendation for a simpler test? I am not the primary user for this software, so a predefined test with included test data would be most helpful for me. We periodically update the operating system for our cluster and so we need to have a test on file that we can run to confirm the software is still operational for our researchers.

Thank you for the help.

Dennis

TomBlock commented 1 year ago

Hi Dennis,

without knowing which data is being processed on the cluster it is difficult to judge - even more difficult to generate a test-graph.

Probably the best advice I can give is to ask the scientists for a processing graph they use (it can be a simple re-projection) and take the confirmed result as your reference dataset. Then you can run the graph every time the cluster environment changes and compare the output with the reference dataset (use band maths to subtract the same variable in the reference from the actual result; if the mean value of the difference is not 0, something is wrong).

Cheers, Tom
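
The comparison step described above can be sketched in a few lines of Python. This is only an illustration, not part of SNAP or the GPT test suite: it assumes the same band from the reference product and from the fresh run has already been exported to two flat lists of floats of equal length, with NaN marking no-data pixels. The function name `compare_bands` and the `tolerance` parameter are hypothetical. In addition to the mean difference suggested above, the sketch reports the maximum absolute difference, because positive and negative deviations can cancel out in the mean.

```python
import math


def compare_bands(reference, actual, tolerance=1e-6):
    """Compare one band of a reference product with a fresh result.

    Returns (mean_diff, max_abs_diff, ok). Pixels that are NaN in either
    input are treated as no-data and skipped. `tolerance` is an assumed
    threshold; choose it to suit the data type of the band.
    """
    if len(reference) != len(actual):
        raise ValueError("bands have different sizes")

    # Difference actual - reference for every pixel valid in both bands.
    diffs = [
        a - r
        for r, a in zip(reference, actual)
        if not (math.isnan(r) or math.isnan(a))
    ]
    if not diffs:
        raise ValueError("no valid pixels to compare")

    mean_diff = math.fsum(diffs) / len(diffs)
    max_abs_diff = max(abs(d) for d in diffs)
    return mean_diff, max_abs_diff, max_abs_diff <= tolerance


if __name__ == "__main__":
    # Two deviations of opposite sign: the mean is close to zero,
    # but the maximum absolute difference still flags the mismatch.
    reference = [0.10, 0.20, float("nan"), 0.40]
    actual = [0.10, 0.25, float("nan"), 0.35]
    print(compare_bands(reference, actual))
```

A check like this could run after each operating system update, with the reference values stored alongside the processing graph.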

TomBlock commented 1 year ago

More information and examples can be found here:

https://senbox.atlassian.net/wiki/spaces/SNAP/pages/70503475/Bulk+Processing+with+GPT

milechin commented 1 year ago

Hi Tom,

Thank you for the additional information. I ran the example provided and I was able to determine that the GDAL package that came with the software is not compatible with our system. So this was a good example to run.

Do you know if there are other external tools, like GDAL, used by the program that should be tested?

Thank you, Dennis

TomBlock commented 1 year ago

Hi Dennis,

SNAP uses some external native libraries. These are extracted on startup to "userhome"/.snap/auxdata. Namely:

Also (if using the s1tbx): jblas (requires libgfortran5)

I hope that's all ...

Cheers, Tom
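
For the system-level dependency mentioned above (libgfortran for jblas), a quick presence check can be scripted. The following is a rough sketch, not a SNAP tool: it uses Python's standard `ctypes.util.find_library` to ask the operating system whether a shared library can be located. The helper name `find_native_libraries` is hypothetical, and the check only shows that some version of the library is discoverable, not that it is the version SNAP needs or that it loads correctly.

```python
from ctypes.util import find_library


def find_native_libraries(names):
    """Map each library name to what the system linker lookup reports.

    The value is the located library name (or path, depending on the
    platform), or None if the library could not be found.
    """
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    # "gfortran" corresponds to libgfortran; inspect the reported
    # file name to see which major version is installed.
    for name, located in find_native_libraries(["gfortran"]).items():
        status = located if located is not None else "NOT FOUND"
        print(f"{name}: {status}")
```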