Deltares / hydromt

HydroMT: Automated and reproducible model building and analysis
https://deltares.github.io/hydromt/
MIT License

Add debug message and documentation on GEBCO topography data #663

Open Tammo-Zijlker-Deltares opened 11 months ago

Tammo-Zijlker-Deltares commented 11 months ago

Kind of request

Adding new functionality

Enhancement Description

The setup options used are:

setup_maps_from_rasterdataset:
  raster_fn: gebco
  variables: ["elevtn"]
  fill_method: nearest
  interpolation_method: nearestNb

Use case

This example was used in development of hydromt-delft3dfm, specifically for a case where we develop a coastal 2D model.

When a requested variable is not available in a dataset, the user should get a warning.

Additional Context

No response

veenstrajelmer commented 11 months ago

In order to use gebco, we currently need to append an extra data_catalog.yml with the following information:

gebco_elev:
  rename:
    gebco: elevtn
  unit_mult:
    elevtn: 1.0
  crs: 4326
  data_type: RasterDataset
  driver: raster
  meta:
    category: topography
    paper_doi: 10.5285/a29c5465-b138-234d-e053-6c86abc040b9
    paper_ref: Weatherall et al (2020)
    source_license: https://www.gebco.net/data_and_products/gridded_bathymetry_data/#a1
    source_url: https://www.bodc.ac.uk/data/open_download/gebco/gebco_2020/geotiff/
    source_version: 2020
    unit: m+MSL
    comment_rename: temporary entry since original gebco file only contains gebco varname, which should be elevtn. We rename it here
    comment_unit_mult: convert from int to float (to avoid FM error "GeoTIFF files with non-float raster bands are currently not supported")
  path: data/gebco.tif

It would be great if the artifact data_catalog could be updated with this information. However, doing so might cause other processes to fail. Hopefully someone can judge this.
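As an illustration of what the rename and unit_mult entries in the catalog snippet above are meant to achieve, here is a minimal plain-Python sketch. It is not HydroMT's implementation: the function apply_catalog_entry is hypothetical, plain lists stand in for raster bands, and applying the rename before the unit multiplication is an assumption based on unit_mult being keyed by the renamed variable (elevtn).

```python
def apply_catalog_entry(data, rename=None, unit_mult=None):
    """Hypothetical stand-in for catalog-style rename and unit_mult handling.

    data maps variable names to lists of values (a stand-in for raster bands).
    """
    rename = rename or {}
    unit_mult = unit_mult or {}
    # Step 1 (assumed order): rename source variables to the target names.
    renamed = {rename.get(name, name): values for name, values in data.items()}
    # Step 2: multiply by the factor keyed on the *renamed* variable.
    # Multiplying integers by 1.0 leaves the values unchanged but makes them floats.
    result = {}
    for name, values in renamed.items():
        if name in unit_mult:
            result[name] = [value * unit_mult[name] for value in values]
        else:
            result[name] = list(values)
    return result


# Integer "gebco" values become float "elevtn" values.
raw = {"gebco": [-12, 0, 35]}
converted = apply_catalog_entry(
    raw, rename={"gebco": "elevtn"}, unit_mult={"elevtn": 1.0}
)
```

The point of the sketch is only that a factor of 1.0 does not change the elevations; it is used here as a workaround to get floating-point values.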

hboisgon commented 11 months ago

Hi @Tammo-Zijlker-Deltares and @veenstrajelmer. Thanks for posting the issue. We have an open issue to better harmonize data conventions in hydromt, and I guess this is related. See #45

As for whether renaming might break something, maybe @roeldegoede can comment for hydromt-sfincs? I think you are the ones using this data.

In your specific case, you could also get around the issue this way:

setup_maps_from_rasterdataset:
  raster_fn: gebco
  fill_method: nearest
  interpolation_method: nearestNb
  rename:
    gebco: elevtn

@Tammo-Zijlker-Deltares I guess you called this function from hydromt-delft3dfm, and so passed an xr.DataArray to the hydromt core function? If so, you may be right that something goes wrong: the check that produces the error message is only applied to xr.Dataset and not to xr.DataArray:

https://github.com/Deltares/hydromt/blob/32850540c6cdd6108e5c530cfad8641c2f3c168e/hydromt/data_adapter/rasterdataset.py#L456-L468

We can correct that.
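A rough sketch of what a check covering both cases could look like is given below. This is not the code at the link above and not a proposed patch: available_variables and check_variables are hypothetical names, and the two Fake classes are stand-ins so that the example runs without xarray installed. The sketch relies only on an xr.Dataset exposing data_vars and an xr.DataArray exposing name.

```python
import warnings


def available_variables(obj):
    """Return variable names for a Dataset-like or DataArray-like object."""
    data_vars = getattr(obj, "data_vars", None)
    if data_vars is not None:
        # Dataset-like: iterating over data_vars yields the variable names.
        return [str(name) for name in data_vars]
    # DataArray-like: a single variable, identified by its (optional) name.
    name = getattr(obj, "name", None)
    return [] if name is None else [str(name)]


def check_variables(obj, variables):
    """Warn about requested variables that are missing; return the missing names."""
    available = available_variables(obj)
    missing = [var for var in variables if var not in available]
    if missing:
        warnings.warn(
            f"Variables {missing} not found in data; available: {available}",
            stacklevel=2,
        )
    return missing


class FakeDataArray:
    """Stand-in for xr.DataArray (only the name attribute is modelled)."""

    def __init__(self, name):
        self.name = name


class FakeDataset:
    """Stand-in for xr.Dataset (only the data_vars mapping is modelled)."""

    def __init__(self, names):
        self.data_vars = {name: None for name in names}


# A single-band raster named "gebco" does not provide "elevtn", so this warns.
missing = check_variables(FakeDataArray("gebco"), ["elevtn"])
```

Whether a missing variable should trigger a warning or an error is a design choice; the sketch warns, as requested in the issue description.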

hboisgon commented 11 months ago

On the second point: the data catalog does not list all existing variables in a dataset. We only use the catalog to rename variables to hydromt names where the names differ. We could add a list of available variables to the meta information of the data (a part not used by hydromt), but I am not sure how useful this is. Also, for some geodataframe data, you would basically need to list all columns, as all columns are variables, so this could get quite cumbersome... I also think it is good if users know, to some extent, what data they are using. @Tammo-Zijlker-Deltares what are your thoughts on this?

For the dtype, we do not support dtype fixing or conversion with the data catalog, but this new feature was requested by @xldeltares in #97.

roeldegoede commented 11 months ago

In HydroMT-SFINCS, we are able to use "gebco" elevations right away without any additional renaming or extra data-catalogs.

Here is the small piece of code that we use for this, where dataset is a dictionary like {"elevtn": "gebco"} when getting something from the data_catalog, or {"da": xr.DataArray} when using in-memory datasets:

da_elv = self.data_catalog.get_rasterdataset(
    dataset.get("elevtn", dataset.get("da")),
    bbox=self.mask.raster.transform_bounds(4326),
    buffer=10,
    variables=["elevtn"],
    zoom_level=(res, "meter"),
)
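The first argument in the call above picks the data source from the dataset dictionary. A small stand-alone sketch of that lookup follows. The helper resolve_elevation_source is hypothetical and not part of HydroMT-SFINCS, and raising an error when neither key is present is an added assumption.

```python
def resolve_elevation_source(dataset):
    """Return the catalog name under "elevtn", else the in-memory object under "da"."""
    # Same lookup as in the call above: prefer "elevtn", fall back to "da".
    source = dataset.get("elevtn", dataset.get("da"))
    if source is None:
        # Assumption: fail early instead of passing None on to the data catalog.
        raise ValueError('dataset needs an "elevtn" or a "da" entry')
    return source


# Catalog-based entry: the result is the data catalog source name.
catalog_source = resolve_elevation_source({"elevtn": "gebco"})

# In-memory entry: the result is the object itself (a placeholder here,
# where the real code would hold an xr.DataArray).
in_memory = object()
memory_source = resolve_elevation_source({"da": in_memory})
```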

So for us, both the artifact-data and deltares-data work fine. Not sure what is happening on your end, but we can have a look together if needed.