bopen / xarray-sentinel

Xarray backend to Copernicus Sentinel-1 satellite data products
Apache License 2.0

Add sample data for tests #19

Closed corrado9999 closed 3 years ago

corrado9999 commented 3 years ago

For testing burst opening, we need real data to pass to rioxarray. Of course, using the original TIFF files is not feasible, as they can easily be very big, on the order of GiBs.

I played a bit to solve this issue and I have a few options:

  1. Put an all-zeros TIFF file in the repository with the same geometric properties as the original ones. Using TIFF compression (the ZSTD method gave the best result), I cut 1.1GiB down to 384KiB.
  2. Same as before, but with the block size set to the whole image (the rationale is that compression is performed at block level). This way the same TIFF shrinks to 344KiB.
  3. Add setup code to the tests that generates the needed TIFF on the fly. The code would be very simple; we could list the destination file name in .gitignore to avoid mistakes, and we could also "cache" the produced file to avoid generating it repeatedly.
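The block-level argument behind option 2 can be illustrated without any TIFF tooling. The sketch below is only an analogy: it uses zlib from the standard library as a stand-in for ZSTD, and the 4 MiB buffer and 64 KiB block size are hypothetical values, not the geometry of a real product. It compresses an all-zeros buffer once block by block and once as a single block, and compares the totals.

```python
import zlib


def compressed_size(data: bytes, block_size: int) -> int:
    """Total size when each `block_size` chunk of `data` is compressed on its own."""
    return sum(
        len(zlib.compress(data[start:start + block_size]))
        for start in range(0, len(data), block_size)
    )


# Hypothetical all-zeros "image" of 4 MiB (not a real TIFF).
image = bytes(4 * 1024 * 1024)

# Many small blocks: every block pays its own fixed compression overhead.
tiled = compressed_size(image, block_size=64 * 1024)

# One block covering the whole image: the overhead is paid only once.
single = compressed_size(image, block_size=len(image))

print(f"64 KiB blocks: {tiled} bytes, single block: {single} bytes")
```

The single-block total comes out smaller, which matches the direction of the 384KiB versus 344KiB figures above, although the actual ratio for a ZSTD-compressed TIFF will differ.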

Personally, I think the second option is not worth losing the original block size, while the third option is more code to maintain and slightly complicates the tests. Nonetheless, the third option could be useful should we need more meaningful data to put in, for example for calibration.
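For reference, the "cache" idea in the third option could look roughly like the sketch below. The names `cached_sample` and `write_placeholder` are hypothetical, and the writer only emits zero bytes; a real writer would have to produce a TIFF with the geometry of the original product, which is not shown here.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_sample(path: Path, writer: Callable[[Path], None]) -> Path:
    """Return `path`, calling `writer(path)` first only if the file is missing."""
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        writer(path)
    return path


def write_placeholder(path: Path) -> None:
    # Hypothetical stand-in for a real TIFF writer: only writes 1 KiB of zeros.
    path.write_bytes(bytes(1024))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "sample.tiff"
    first = cached_sample(target, write_placeholder)   # generates the file
    second = cached_sample(target, write_placeholder)  # reuses the existing file
    generated_size = first.stat().st_size
```

In a test suite this would probably be wrapped in a session-scoped pytest fixture, with the target path listed in .gitignore as described above.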

alexamici commented 3 years ago

Option 1 looks good to me; you can use a fixed value instead of 0 to test the calibration.

alexamici commented 3 years ago

Another option that is good for me:

  4. Only test data access after manual download of the full products (mark data-related tests as xfail).

corrado9999 commented 3 years ago

Option 4 is what is currently implemented. The drawback is that, if you make a change that breaks a test, the failure goes unnoticed. @aurghs also told me that the incoming changes in the update-README.md branch will make pytest --doctest fail in GitHub Actions if the burst data are not available, as there is an example in README.md showing how to access a burst.

Thus, I am opening a branch to upload sample data files and remove the xfail mark.