Closed maxfreu closed 2 years ago
Hi @maxfreu,
out of the box, it is currently not possible to achieve this.
However, if changing a few lines of code and re-compiling is an option for you: in src/cross-level/brick-cl.c, ll. 740ff, you can add or change the GDAL creation options and make a custom build.
Hope this helps, David
case _FMT_GTIFF_:
  driver = GDALGetDriverByName("GTiff");
  options = CSLSetNameValue(options, "COMPRESS", "LZW");
  options = CSLSetNameValue(options, "PREDICTOR", "2");
  options = CSLSetNameValue(options, "INTERLEAVE", "BAND");
  options = CSLSetNameValue(options, "BIGTIFF", "YES");
  if (brick->cx > 0){
    nchar = snprintf(xchunk, NPOW_08, "%d", brick->cx);
    if (nchar < 0 || nchar >= NPOW_08){
      printf("Buffer Overflow in assembling BLOCKXSIZE\n"); return FAILURE;}
    options = CSLSetNameValue(options, "BLOCKXSIZE", xchunk);
  }
  if (brick->cy > 0){
    nchar = snprintf(ychunk, NPOW_08, "%d", brick->cy);
    if (nchar < 0 || nchar >= NPOW_08){
      printf("Buffer Overflow in assembling BLOCKYSIZE\n"); return FAILURE;}
    options = CSLSetNameValue(options, "BLOCKYSIZE", ychunk);
  }
  break;
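As an untested sketch only (not code from the FORCE repository): to get the output asked about in the original question - DEFLATE, horizontal differencing, 256x256 blocks - the option values in this case block could be changed roughly as follows before re-compiling. The creation options themselves (COMPRESS, PREDICTOR, TILED, BLOCKXSIZE, BLOCKYSIZE) are standard GDAL GTiff options; note that TILED=YES is needed for square blocks, which replaces the chunk-derived strip layout set via brick->cx / brick->cy above.

// sketch: alternative GTiff creation options (DEFLATE, predictor 2, 256x256 tiles)
options = CSLSetNameValue(options, "COMPRESS", "DEFLATE");
options = CSLSetNameValue(options, "PREDICTOR", "2");      // horizontal differencing
options = CSLSetNameValue(options, "INTERLEAVE", "BAND");
options = CSLSetNameValue(options, "BIGTIFF", "YES");
options = CSLSetNameValue(options, "TILED", "YES");
options = CSLSetNameValue(options, "BLOCKXSIZE", "256");
options = CSLSetNameValue(options, "BLOCKYSIZE", "256");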
Yes, that helps, thank you! However, it might make sense to expose these options to the user, as such basic things dramatically impact the access speed.
Depends on what you are doing and how you access the data. For the access pattern that FORCE uses in the higher level processing, the default one is fairly optimal.
In addition, exposing every small parameter has the downside of overwhelming the user...
I will think about this one, though. I will probably implement a specific COG output option, which will kind of give you what you require. It will be fairly close to the default GTiff though. The only major difference is block size and pixel interleave (although I have some reservations about pixel interleaving, as all GDAL reads are band-wise at the lowest programmatic level).
I think band-wise interleaving is ok, unless you always read all bands. I have just had good experiences with deflate compression, zlevel 1, the horizontal predictor and smaller block sizes - resulting in smaller files (despite the low zlevel) and much faster reads. You might want to try it out :)
When using GDAL, it does not really matter if you read all bands or not because the data access is band-based. You first request a band, and then read a spatial subset like this:
dataset = GDALOpenEx(file, GDAL_OF_READONLY, NULL, NULL, NULL);
band = GDALGetRasterBand(dataset, 1);  // band indices are 1-based in GDAL
GDALRasterIO(band, GF_Read,
  xoff_disc, yoff_disc, nx_disc, ny_disc,
  read_buf, nx_read, ny_read, GDT_Int16, 0, 0);
Thus, even when reading all bands, each consecutive read (for as many as we have bands) skips bytes if we have pixel-interleave... Eventually, we have read a whole block of consecutive data, but the individual accesses were fragmented nonetheless.
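To put that pattern into a sketch (this is not code from the FORCE source; read_buf is assumed here to be an array with one buffer per band):

int b, nbands = GDALGetRasterCount(dataset);

for (b = 1; b <= nbands; b++){  // GDAL band indices start at 1
  band = GDALGetRasterBand(dataset, b);
  GDALRasterIO(band, GF_Read,
    xoff_disc, yoff_disc, nx_disc, ny_disc,
    read_buf[b-1], nx_read, ny_read, GDT_Int16, 0, 0);
}

With INTERLEAVE=BAND, each of these per-band reads maps to contiguous bytes in the file; with INTERLEAVE=PIXEL, each read has to pick its samples from in between those of the other bands.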
For the compression: much faster? Do you have some benchmark for this? From my experience so far, Deflate and LZW are not too different. See also here: https://kokoalberti.com/articles/geotiff-compression-optimization-guide/
If you want faster access, zstd compression might be worth looking into?
Thus, even when reading all bands, each consecutive read (for as many as we have bands) skips bytes if we have pixel-interleave... Eventually, we have read a whole block of consecutive data, but the individual accesses were fragmented nonetheless.
Oh, good to know! That sounds inefficient, and I wonder why there aren't two distinct read modes...
For the compression: much faster? Do you have some benchmark for this?
Unfortunately I didn't write a proper benchmark, but there is one along with code here. I think the majority of the speedup I observed came from the smaller block sizes I used, as I often access much less than a tile's width.
If you want faster access, zstd compression might be worth looking into?
I don't know why I didn't test that, but according to the link above it seems fast! Especially the combination ZSTD, zstd_level = 1, predictor = 2 looks like an interesting tradeoff between read speed and compression, while still getting almost double the LZW write speed. This, however, depends a lot on the data. Maybe I can re-run the above benchmark with some of the tiles stored in the CODE-DE datacube.
The benchmark you sent is the same one I posted above. LZW and DEFLATE are more or less in the same ballpark. But I believe I had some rationale for not using DEFLATE. As far as I remember, DEFLATE didn't work with all software packages.
Do you mean the CODE-DE datacube under /code-de/community/force ? Please note that access to this datacube is still preliminary and that reading speed is still of concern...
The benchmark you sent is the same one I posted above.
Oh sorry, it seems I overlooked it.
Do you mean the CODE-DE datacube under /code-de/community/force ? Please note that access to this datacube is still preliminary and that reading speed is still of concern...
Yes, exactly. I copied some files onto a local SSD for testing. The access speed there, especially for listing files, is quite bad indeed. I'll get back to you when the benchmark is done, but it seems like my GDAL comes without ZSTD support...
Hmm, another thing you probably need to consider in this case:
Using an SSD as data storage is not realistic in all cases. With an SSD, your performance will mostly depend on how fast your CPU can crack the compression - less so on reading throughput. I would also suggest additionally measuring the performance when reading multiple files in parallel (because decompression algorithms are usually parallelized; when you parallelize elsewhere, i.e. take cores away from the decompression algorithm, things might shift).
Here is the benchmark, carried out with three randomly picked Sentinel-2 image tiles from different locations. Numbers are averaged across the three images. The benchmark tests gdal_translate, so the numbers for "write" include a read from uncompressed data on disk, and vice versa. Data was read from and written to the same HDD (not an SSD); the CPU is a Xeon at max 3.7 GHz, and only one core was used. The repetition count is 3. If set, the tiling was 256x256 pixels.
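For reference, a test of this kind boils down to timing commands along these lines (file names are placeholders; the -co flags are standard GTiff creation options):

gdal_translate -co COMPRESS=ZSTD -co ZSTD_LEVEL=3 -co PREDICTOR=2 \
  -co TILED=YES -co BLOCKXSIZE=256 -co BLOCKYSIZE=256 \
  uncompressed_in.tif compressed_out.tif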
I think ZSTD with level 3, horizontal predictor and tiling is the sweet spot. A 20TB datacube with comparable imagery would shrink by around 3TB or 16%, while the read and write speed approximately doubles. The access to small subsets of the data will be sped up even more by the tiled compression.
Certainly there are many more considerations when choosing the compression settings, but if I as a user were given the choice, I'd go for the settings above, because my downstream software & code can handle it.
Dear @maxfreu,
I thought for some time about implementing custom GDAL options and decided it would indeed be beneficial for some expert users.
I put some time into making your request possible. You will find a new force branch here: https://github.com/davidfrantz/force/tree/gdal_options
To enable this, some substantial changes in the belly of the beast were necessary. These changes affect the entire FORCE codebase, e.g. force-level2 and force-higher-level. Thus, I would like to treat this branch as a release candidate and won't merge it immediately. It would be really nice if you could do some testing with this - both with the custom and the built-in formats.
There is now a new parameter: FILE_OUTPUT_OPTIONS. This expects a file, and will only be activated when OUTPUT_FORMAT = CUSTOM.
The text file should be written in tag-and-value notation like this:
DRIVER = GTiff
EXTENSION = tif
BIGTIFF = YES
COMPRESS = ZSTD
PREDICTOR = 2
ZLEVEL = 1
INTERLEAVE = BAND
TILED = YES
BLOCKXSIZE = 1024
BLOCKYSIZE = 1024
Important: the file needs at least the DRIVER (GDAL short driver name) and EXTENSION, and then a variable number of GDAL options (up to 32 - this should be enough, right?).
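In the parameter file, switching this on would then presumably look like this (the path is a placeholder):

OUTPUT_FORMAT = CUSTOM
FILE_OUTPUT_OPTIONS = /path/to/gdal-options.txt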
One word of caution: by opening this up to the user, it is now possible to give invalid or conflicting options that result in a failure to create files.
My first tests with ZSTD are indeed promising! Though I would like to test this more rigorously before making it the default, especially since many GDAL installations might not support it yet. ZSTD support was introduced in GDAL 2.3, which has only been available from the main repository since Ubuntu 20.04 LTS. I would also like to test whether this compression works with other image processing software that is commonly used in the field (ENVI, ArcGIS, ERDAS, etc. - at least QGIS seems to be fine!).
Cheers, David
Wow, sounds great! Thank you :) 32 options sounds enough. I will try it out as soon as possible and report back.
Question: Is force processing at all affected by the block size of images? E.g. are there any statistics calculated block-wise, which would become useless if blocks are too small?
Btw: I started converting our local datacube to zstd, resulting in 19% space savings. Accessing single pixels in the L2 cube for all points in time at once is now 13 times faster with a block size of 128.
Sounds promising. FORCE uses its standard blocks (strips) as processing units for the higher level processing. See here: https://force-eo.readthedocs.io/en/latest/components/higher-level/hl-compute.html
Reading compressed data on block boundaries should be more efficient in theory. But if ZSTD compensates for this, this might indeed be a good alternative. I am currently asking our lab members to open ZSTD-compressed files with their favorite software - I am curious what the outcome will be :)
I received some quick feedback from colleagues:
Actually I have to report that it doesn't work for me for some reason. When I run force-level2, the per-image logs always tell me "parsing metadata failed", no other errors. Btw: I had to patch the Makefile and manually include numpy/ndarrayobject.h, but I think that's unlikely to be the source of the error. So how can we track this down? With the master branch it works.
Do you by any chance try to process a Landsat 9 image? The branch with the GDAL options was forked before the Landsat 9 branch.
numpy: this can happen if your python installation is not in the default path. This is very hard for me to control..
Do you by any chance try to process a Landsat 9 image?
No, only Sentinel 2 A/B.
numpy: this can happen if your python installation is not in the default path. This is very hard for me to control.
Default CentOS installation path, but no worries, after adapting it, it compiles. However, I could imagine that the source of the error is some misconfiguration of the machine.
L1 download file:
#!/bin/bash
p="/data_hdd/force"
metadata_dir="$p/metadata"
l1_dir="$p/l1/ukraine"
queue_file="$l1_dir/queue.txt"
aoi="29.9/51.54,29.9/49.9,31.4/49.9,31.4/51.54,29.9/51.54"
t_start="20220101"
t_end="20220302"
echo $l1_dir
force-level1-csd -u -s S2A,S2B $metadata_dir
force-level1-csd -c 0,30 -d $t_start,$t_end -s S2A,S2B -k $metadata_dir $l1_dir $queue_file $aoi
Output options file:
DRIVER = GTiff
EXTENSION = tif
BIGTIFF = YES
COMPRESS = ZSTD
PREDICTOR = 2
ZLEVEL = 3
INTERLEAVE = BAND
TILED = YES
BLOCKXSIZE = 128
BLOCKYSIZE = 128
Would compiling in debug mode output more info?
"parsing metadata failed" means that there is some error in reading the input image's metadata. This is likely not related to the parameterization.
In this case, running FORCE in debug mode might be a good idea indeed. If you do, please do not use force-level2, but rather use force-l2ps on a single image that throws this error message.
When I run force-l2ps ./l1/ukraine/T35UPS ./param/param-ukraine.prm, I get the following error message in debug mode:
<param file content>
check that all meta is initialized, brick as well?
there are still some things to do int meta. checking etc
unknown Satellite Mission. Parsing metadata failed.
When I run force-l2ps ./l1/ukraine/T35UPS/S2A_MSIL1C_20220222T092031_N0400_R093_T35UPS_20220222T113539.SAFE/ ./param/param-ukraine.prm, I get:
Next week I'll maybe try it on another machine.
The second call is correct. I will need to look at this image and check whether I can reproduce this here...
yep, there was a tiny bug indeed. Can you try again? It should work now
Well, now it does something with the above output options, but there still seems to be an error with tiling:
force-l2ps ./l1/ukraine/T36UUA/S2A_MSIL1C_20220213T085051_N0400_R107_T36UUA_20220213T094348.SAFE/ ./param/param-ukraine.prm
dc: 32.14%. wc: 3.81%. sc: 6.63%. cc: 0.59%. AOD: 0.1211. # of targets: 92/540.
Error creating file /data_hdd/force/l2/ukraine/X0011_Y0036/20220213_LEVEL2_SEN2A_QAI.tif. Error creating file /data_hdd/force/l2/ukraine/X0011_Y0036/20220213_LEVEL2_SEN2A_BOA.tif.
...
error in cubing Level 2 products
Tiling images failed! Error in geometric module.
Only lock files are produced.
Custom GDAL options used?
With opening up the settings, there is no guarantee anymore that files can be created... I have observed this when playing around with the block sizes. Not all were accepted by GDAL.
I used these exact settings with gdal_translate. But wait: I think now the other issue with the custom compilation kicks in. I forgot that the system GDAL does not support ZSTD, so I have to link against my conda environment's GDAL during compilation... Sorry for the noise.
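(For reference, whether a given GDAL build supports ZSTD for GTiff can be checked from the driver's creation-option list, e.g. with gdalinfo --format GTiff | grep -i zstd.)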
Ok, now I got it working. The files turn out to have exactly the desired settings. A bit unexpectedly, the processing time went up for me compared to LZW (150 s vs 120 s on average). I expect this is due to the small tile size of 128 pixels.
So essentially I can confirm this is good to go.
I just merged the branch into develop. This feature is now officially available and will slip into the main branch soon, too.
Thanks for your input again! David
Hi! I would like to save my L2 processing output as DEFLATE compressed tifs with a block size of 256x256 (instead of one wide strip) and horizontal differencing. Is that possible, and if yes, how?