OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

how to avoid 4th band as mask with COG driver using Jpeg Compression #4853

Closed gmaillet closed 2 years ago

gmaillet commented 2 years ago

Expected behavior and actual behavior.

GDAL creates correct 4-band COGs with any other compression, but the COG driver has a special case for JPEG compression:

    if( EQUAL(osCompress, "JPEG") &&
        poCurDS->GetRasterCount() == 4 )
    {
        char** papszArg = nullptr;
        papszArg = CSLAddString(papszArg, "-of");
        papszArg = CSLAddString(papszArg, "VRT");
        papszArg = CSLAddString(papszArg, "-b");
        papszArg = CSLAddString(papszArg, "1");
        papszArg = CSLAddString(papszArg, "-b");
        papszArg = CSLAddString(papszArg, "2");
        papszArg = CSLAddString(papszArg, "-b");
        papszArg = CSLAddString(papszArg, "3");
        papszArg = CSLAddString(papszArg, "-mask");
        papszArg = CSLAddString(papszArg, "4");
        GDALTranslateOptions* psOptions = GDALTranslateOptionsNew(papszArg, nullptr);
        CSLDestroy(papszArg);
        GDALDatasetH hRGBMaskDS = GDALTranslate("",
                                                GDALDataset::ToHandle(poCurDS),
                                                psOptions,
                                                nullptr);
        GDALTranslateOptionsFree(psOptions);
        if( !hRGBMaskDS )
        {
            return nullptr;
        }
        m_poRGBMaskDS.reset( GDALDataset::FromHandle(hRGBMaskDS) );
        poCurDS = m_poRGBMaskDS.get();
    }

If the 4th band holds real data (for example infrared), that data is replaced by a binary mask.
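The condition in the snippet above can be paraphrased in a few lines of Python. This is only an illustrative sketch of the branch logic, not GDAL code: the function name `cog_jpeg_translate_args` is hypothetical, and the returned list simply mirrors the arguments the C++ code passes to `GDALTranslateOptionsNew`.

```python
from __future__ import annotations


def cog_jpeg_translate_args(compress: str, band_count: int) -> list[str]:
    """Return the gdal_translate-style arguments built by the quoted snippet.

    Hypothetical helper for illustration only. EQUAL() in the C++ code is a
    case-insensitive comparison, and the branch fires only for exactly 4 bands.
    An empty list means the source dataset is used unchanged.
    """
    if compress.upper() == "JPEG" and band_count == 4:
        # Bands 1-3 are kept as data; band 4 is turned into the mask.
        return ["-of", "VRT", "-b", "1", "-b", "2", "-b", "3", "-mask", "4"]
    return []


if __name__ == "__main__":
    print(cog_jpeg_translate_args("JPEG", 4))  # special case applies
    print(cog_jpeg_translate_args("LZW", 4))   # no special case
```

Read this way, the snippet never inspects the color interpretation of band 4. Any 4-band input with JPEG compression takes the mask path, which is why an infrared band is affected.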

Steps to reproduce the problem.

Create any 4-band TIFF image, RGB_IR.tif, then create an LZW COG and a JPEG COG with gdal_translate:

gdal_translate -of COG -co 'COMPRESS=LZW' RGB_IR.tif COG_LZW.tif
gdal_translate -of COG -co 'COMPRESS=JPEG' RGB_IR.tif COG_JPEG.tif

The 4th band of the LZW COG is correct, but the 4th band of the JPEG COG is a binary mask.


Operating system

MacOS 10.15.7

GDAL version and provenance

GDAL 3.2.1, released 2020/12/29

jratike80 commented 2 years ago

The JPEG format supports only 1-band and 3-band images with an optional mask, and so does GDAL: https://gdal.org/drivers/raster/jpeg.html

The driver also supports the “zlib compressed mask appended to the file” approach used by a few data providers to add a bitmask to identify pixels that are not valid data. See RFC 15: Band Masks for further details.

JPEG compression is not usable for real 4 band data.

gmaillet commented 2 years ago

OK, but that is not how the GTiff driver behaves, and it is very similar to the COG driver, isn't it?

jratike80 commented 2 years ago

Good question. Perhaps I have misunderstood something. JPEG and GDAL actually seem to support writing 4 real bands if the color space is CMYK: https://gdal.org/drivers/raster/jpeg.html#creation-options. Maybe the COG and GTiff drivers interpret the nature of the source data differently, so that COG writes the fourth band as a zlib-compressed mask while GTiff writes it as the K band of CMYK? Consistent and documented behavior would be desirable.

I made a quick test

gdal_translate -of cog -co compress=jpeg 4bandtest.tif 4bandcog.tif
gdal_translate -of gtiff -co compress=jpeg 4bandtest.tif 4bandgtiff.tif

The COG version eats the 4th band and converts it into alpha that cannot be accessed as a separate band.

Band 1 Block=512x512 Type=Byte, ColorInterp=Red
  Overviews: 6000x6000, 3000x3000, 1500x1500, 750x750, 375x375
  Mask Flags: PER_DATASET
  Overviews of mask band: 6000x6000, 3000x3000, 1500x1500, 750x750, 375x375
Band 2 Block=512x512 Type=Byte, ColorInterp=Green
  Overviews: 6000x6000, 3000x3000, 1500x1500, 750x750, 375x375
  Mask Flags: PER_DATASET
  Overviews of mask band: 6000x6000, 3000x3000, 1500x1500, 750x750, 375x375
Band 3 Block=512x512 Type=Byte, ColorInterp=Blue
  Overviews: 6000x6000, 3000x3000, 1500x1500, 750x750, 375x375
  Mask Flags: PER_DATASET
  Overviews of mask band: 6000x6000, 3000x3000, 1500x1500, 750x750, 375x375

The GTiff driver writes 4 bands (the 4th is labeled Blue because in my 4-band test file it is a copy of the 3rd band of an RGB image):

Band 1 Block=12000x16 Type=Byte, ColorInterp=Red
Band 2 Block=12000x16 Type=Byte, ColorInterp=Green
Band 3 Block=12000x16 Type=Byte, ColorInterp=Blue
Band 4 Block=12000x16 Type=Byte, ColorInterp=Blue

The histograms of band 4 and band 3 are identical (as noted, they are copies in the source data), so the 4th band does contain real data and this GeoTIFF should be usable for RGB-NIR data, for example.

To confirm that 4 bands is the upper limit, I also ran a test with 5 bands.

gdal_translate -of gtiff -co compress=jpeg 5bandtest.tif 5bandgtiff.tif
Input file size is 12000, 12000
Warning 5: PHOTOMETRIC=RGB value does not correspond to number of bands (1), ignoring.  Set the Photometric Interpretation as MINISBLACK.
0...10...20...30...40...50...60...70...80...90...100 - done.
ERROR 1: JPEGLib:Too many color components: 5, max 4
ERROR 1: JPEGLib:Too many color components: 5, max 4
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGLib:Too many color components: 5, max 4

jratike80 commented 2 years ago

I tried to fool the COG driver with

gdal_translate -of cog -co compress=jpeg GTIFF_RAW:4bandtest.tif 4bandcog2.tif -co photometric=cmyk

but the COG driver does not have a "photometric" creation option, and using GTIFF_RAW did not make any difference.

gmaillet commented 2 years ago

Thank you for your help. Given the lines quoted above, I'm afraid there is no way around the problem without modifying cogdriver.cpp.

rouault commented 2 years ago

Fix queued in https://github.com/OSGeo/gdal/pull/4865

With PlanarConfiguration=Contiguous (INTERLEAVE=PIXEL), JPEG compression can support up to 4 channels. For 5 channels or more, PlanarConfiguration=Separate (INTERLEAVE=BAND) must be used to compress each channel separately, but the COG definition so far doesn't allow this.
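The rule described above can be written down as a small decision helper. This is a sketch of the stated rule only, not GDAL's implementation: the names `jpeg_interleave_for` and `cog_jpeg_possible` are hypothetical, and the COG restriction reflects the situation described in this thread.

```python
# Limit for pixel-interleaved JPEG as stated in the comment above; it matches
# the libjpeg error "Too many color components: 5, max 4" seen earlier.
JPEG_MAX_PIXEL_INTERLEAVED_BANDS = 4


def jpeg_interleave_for(band_count: int) -> str:
    """Return the INTERLEAVE value JPEG compression needs for a band count."""
    if band_count < 1:
        raise ValueError("band_count must be at least 1")
    if band_count <= JPEG_MAX_PIXEL_INTERLEAVED_BANDS:
        return "PIXEL"  # PlanarConfiguration=Contiguous
    return "BAND"       # PlanarConfiguration=Separate, one channel at a time


def cog_jpeg_possible(band_count: int) -> bool:
    """True if a JPEG COG is possible under the rule in this thread.

    Assumption: COG only permits pixel interleaving, as stated above.
    """
    return jpeg_interleave_for(band_count) == "PIXEL"


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, jpeg_interleave_for(n), cog_jpeg_possible(n))
```

Under this rule, a 4-band RGB plus infrared image fits within pixel-interleaved JPEG, while the 5-band test earlier in the thread would need band interleaving.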