Open niclasmattsson opened 3 years ago
Ha, I meant that this is more likely an OSGeo/gdal issue, in that there is unlikely to be any Julia wrapping code that behaves differently somewhere between 1000 and 2920 bands.
It might be some work, but perhaps you can reproduce this with generated data, so that people don't need to download such a large raster? Then you can also check whether it still happens with, say, 2 rows and 2 columns. Another way to verify is to write this test in Python and see whether it occurs there as well. GDAL devs are also more likely to be able to run those reproducers.
For compression it can play a big role whether your file is band or pixel interleaved and whether or not it is tiled. gdal_translate might copy this info from the source file, while your code does not.
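To make the interleaving point concrete, here is a toy sketch in plain Python. It does not use GDAL at all: `zlib` stands in for a DEFLATE-style codec, and the raster is synthetic (spatially noisy, identical across bands), so the sizes and the band count are assumptions chosen purely for illustration. It only shows that the same values can compress very differently depending on byte order. Which layout wins for a real file depends on the data.

```python
import random
import zlib

# Hypothetical sizes: one band is larger than zlib's 32 KiB match window.
BANDS, PIXELS = 4, 65536

# Synthetic raster: noisy in space, but every band holds the same values,
# so all redundancy is between bands rather than between neighbouring pixels.
rng = random.Random(0)
band = bytes(rng.randrange(256) for _ in range(PIXELS))

# Band-interleaved layout: all of band 0, then all of band 1, and so on.
band_interleaved = band * BANDS

# Pixel-interleaved layout: the BANDS values of pixel 0, then of pixel 1, ...
pixel_interleaved = bytes(v for v in band for _ in range(BANDS))

band_size = len(zlib.compress(band_interleaved, 6))
pixel_size = len(zlib.compress(pixel_interleaved, 6))

print(f"raw bytes:                {len(band_interleaved)}")
print(f"band-interleaved (zlib):  {band_size}")
print(f"pixel-interleaved (zlib): {pixel_size}")
```

In this contrived case the pixel-interleaved buffer compresses far better, because the repeated values sit next to each other instead of a whole band apart. With spatially smooth data and unrelated bands the outcome could easily flip, which is why the layout of the source file is worth comparing against the layout your code writes.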
Originally posted on Discourse; the full text is below, including test code and links to sample data. I opened the issue here because visr suggested the problem might be in GDAL.jl instead of ArchGDAL. https://discourse.julialang.org/t/inefficient-compression-of-very-large-geotiffs-with-archgdal-compared-to-gdal-binaries/70687/1
I'm using ArchGDAL to save climate data as large 3D GeoTIFFs, but I've noticed that files created in ArchGDAL are roughly 3x larger than those created with gdal_* binaries. As a diagnostic, I removed all my calculations and wrote an ArchGDAL function that just resaves its input file using a given compression method. Then I wrapped this in a test function that also recompresses the ArchGDAL output using gdal_translate and compares the resulting file sizes.
Spoiler: small input GeoTIFFs have the same output size, but very large input files (4 GB) become much larger after passing through ArchGDAL.
First my test code:
Running the tests using ZSTD compression:
My 5 test files are available in a Box folder here. I get similar results for other compression methods. Some valid values are DEFLATE, LZW and NONE. My test results for these are in a log file in that Box folder. My installed versions: ArchGDAL v0.7.4 and GDAL_jll v300.202.100+0.
I suspect the problem is some kind of interaction with the BIGTIFF setting, which is required for creating GeoTIFFs larger than 4 GB.
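For context on the 4 GB threshold: classic TIFF stores file offsets as unsigned 32-bit integers, so a file cannot address more than 2^32 bytes. The sketch below is a rough back-of-the-envelope check in plain Python. The grid size and the 4-byte sample type are hypothetical (only the 2920 bands come from this thread), and the helper names are made up for illustration. It only bounds the uncompressed pixel data and ignores headers and compression.

```python
# Classic TIFF uses unsigned 32-bit file offsets, hence the 4 GiB ceiling.
CLASSIC_TIFF_LIMIT = 2**32


def uncompressed_bytes(width, height, bands, bytes_per_sample):
    """Size of the raw pixel data, ignoring headers and compression."""
    return width * height * bands * bytes_per_sample


def exceeds_classic_tiff(width, height, bands, bytes_per_sample):
    """True if the raw pixel data alone cannot fit in a classic TIFF."""
    return uncompressed_bytes(width, height, bands, bytes_per_sample) >= CLASSIC_TIFF_LIMIT


# Hypothetical 1440 x 720 grid, 2920 bands, 4-byte samples (assumed float32).
print(uncompressed_bytes(1440, 720, 2920, 4), exceeds_classic_tiff(1440, 720, 2920, 4))

# The 2 x 2 reproducer suggested in the thread stays far below the limit.
print(uncompressed_bytes(2, 2, 2920, 4), exceeds_classic_tiff(2, 2, 2920, 4))
```

If the size discrepancy also shows up for a tiny raster like the second case, that would point away from BIGTIFF as the cause. If it only appears above the limit, the suspicion becomes more plausible.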