cogeotiff / rio-cogeo

Cloud Optimized GeoTIFF creation and validation plugin for rasterio
https://cogeotiff.github.io/rio-cogeo/
BSD 3-Clause "New" or "Revised" License
308 stars 42 forks source link

add additional profiles #251

Closed AndrewAnnex closed 1 year ago

AndrewAnnex commented 1 year ago

In this PR I've made a few additional profiles that I've developed over the past year for planetary datasets (meter-resolution DEMs and grayscale 1-band byte imagery) as informed by this excellent blog post https://kokoalberti.com/articles/geotiff-compression-optimization-guide/.

The biggest set of contributions are the additional LERC profiles that include the MAX_Z_ERROR parameters for 1cm, 10cm, and 25cm resolutions which are appropriate for meter-resolution elevation products. I picked these values as they are close to, but finer than the theoretical resolution of the more common elevation data sets encountered in planetary science, and is likely applicable to many earth-datasets save drone SfM products.

Users attempting to encode data with precisions worse-than 50cm are probably better encoded the data as Int16. And precisions finer than 1cm starts to result in diminishing returns as well for LERC (although a 1mm lerc profile could be worth adding).

I've tested this branch locally by running a few of the new profiles against products and comparing to versions I've made by hand with gdal_translate. It appears to be working well although I haven't tested it every single new profile.

I'd accept the criticism that this adds a lot of new profiles, maybe we can think about better ways to name them or intelligently selecting predictor 2 and 3 depending on the input/output data type.

One thing I should check on is if the floating point predictor can be applied to LERC, as that would further improve those profiles.

AndrewAnnex commented 1 year ago

looks like the predictor flag is ignored with LERC compression based on a test I just ran (md5sums matched for two files with otherwise identical -co options), hope to find some documentation related to that at some point

vincentsarago commented 1 year ago

@AndrewAnnex I'm not sure LERC is available by default in the rasterio wheels you may need to test with a custom GDAL env

EDIT: It's not in rasterio wheels https://github.com/rasterio/rasterio-wheels/issues/94

AndrewAnnex commented 1 year ago

@vincentsarago I was using a conda environment and have been using lerc successfully for a while (including when I tested these changes locally). This project already has a LERC profile though so it shouldn't be an issue?

AndrewAnnex commented 1 year ago

@vincentsarago ugh and now I reminded why I prefer to stay away from c ... okay as I understand it now, LERC was supported inadvertently in the rasterio pypi wheel before and support was removed with a change to how libtiff is used/packaged. Following the trail of issues it was listed as a no-fix a while-ago due to a issue with LERC's build setup that was fixed at some point. Potentially requesting LERC being reconsidered for rasterio-wheels would now be tractable?

LERC is available in the conda-forge distributions of gdal so it is available to be used by rasterio in that environment. So maybe worth asking again to bring parity between the two distributions

AndrewAnnex commented 1 year ago

LERC pypi issue being worked on in https://github.com/rasterio/rasterio-wheels/pull/100

jlaura commented 1 year ago

@AndrewAnnex I wonder if you might also consider the following that is being used for HiRISE data, taking it to 32-bit. This uses LERC and is designed to have a final precision that is within the accuracy of the HiRISE calibration, thereby making the compressed HiRISE effectively lossless.

class LERCDEFLATEProfileHiRISE(Profile):
    """Tiled, pixel-interleaved, LERC_DEFLATE-compressed GTiff."""

    defaults = {
        "driver": "GTiff",
        "interleave": "pixel",
        "tiled": True,
        "blockxsize": 512,
        "blockysize": 512,
        "compress": "LERC_DEFLATE",
        "MAX_Z_ERROR": 0.0001,
    }
AndrewAnnex commented 1 year ago

@jlaura I've added a profile for 10^-4 called "decimilli" and renamed all the profiles to otherwise standard metric SI prefixes as I want the names to be generic as possible for now. One could add instrument/product specific profiles but for now want to avoid that

AndrewAnnex commented 1 year ago

I also removed the 25cm profile, it's still useful relative to the deci profile but I don't know what to call it.

AndrewAnnex commented 1 year ago

@vincentsarago The 1.3.7 rasterio that was just released includes the changes from https://github.com/rasterio/rasterio-wheels/pull/100 so I think the issue about LERC availability is solved. I bumped the rasterio min version, so I think this also means that a newer gdal version is being used as well

vincentsarago commented 1 year ago

@AndrewAnnex first let me say that I really appreciate the effort and contribution around all projects.

About this PR I'm not sure we need to add more profile while they can already be extended quite easily by users either in the CLI or the API level.

Each driver having a set of options, we could open the door of having not an infinite but a large set of profiles.

At the same, if you're ok with ☝️ we could add a note in the documentation about how to extent the cog_profiles

from rasterio.profiles import Profile
from rio_cogeo.profiles import cog_profiles

class LERCDEFLATEProfiledeci(Profile):
    """Tiled, pixel-interleaved, LERC_DEFLATE-compressed GTiff."""

    defaults = {
        "driver": "GTiff",
        "interleave": "pixel",
        "tiled": True,
        "blockxsize": 512,
        "blockysize": 512,
        "compress": "LERC_DEFLATE",
        "MAX_Z_ERROR": 0.1,
    }

cog_profiles.update({"lerc_deflate_deci": LERCDEFLATEProfiledeci()})
AndrewAnnex commented 1 year ago

@vincentsarago I think that's reasonable, closing this pr!