I am attempting to register a NetCDF file of climate data using the registrar. The data registers correctly into the database. However, when the registered data is selected for rendering on the map, I receive a 400 response with the error message: Failed to parse NETCDF: prefix string into expected 2, 3 or 4 fields. The data is not displayed on the map. The issue seems to stem from the inclusion of the band index in the subdataset locator string that is passed to gdal.Open() to read the file.
Method to Reproduce
Steps to reproduce the behavior:
1. Deploy EOxViewServer via Helm chart, with collections and product types set up for the selected climate data.
2. Use NetCDF data available in a STAC catalog here
3. Register the NetCDF file, either once copied into the cluster or from an S3 bucket, using the following CLI command: kubectl exec -it -n eoxviewserver deployment/eoxviewserver-registrar -- python3 /var/www/pvs/dev/pvs_instance/manage.py timeseries register --collection UKCP --storage data_s3 --path "clt_rcp85_land-cpm_uk_5km_01_day_20601201-20701130.nc" --product-type-name UKCPCLT --x-dim-name "projection_x_coordinate" --y-dim-name "projection_y_coordinate" --time-dim-name "time" --coverage-type-mapping "clt:UKCP_2060" --product-template "{collection_identifier}_{file_identifier}_{index}"
4. The data registers correctly and can be viewed in the Django Admin pages.
5. On the Client page, select the appropriate layer and navigate to the spatial and temporal location of the data. The outline of the selected data is visible.
6. Select the correct browse type to request display of the ingested NetCDF data. The data is not displayed on the map.
7. Using the browser's 'inspect' functionality, identify the WMS request sent by EOxViewServer to retrieve the climate data. Executing this request with curl outputs the above "Failed to parse" error.
8. In the Django Admin page, deleting the index value from the 'Subdataset Locator' field causes the first time slice (band 1) of the data to be displayed correctly on the map, e.g. deleting :3599 from :clt:3599.
Expected Behaviour
I would expect each individual time slice from the NetCDF to be rendered on the map when the timeline selection so requires. I would expect the subdataset locator to be used to identify which array from the NetCDF should be rendered at any selected time.
My Understanding
From looking at the source code, it seems the error is caused by the call to gdal.Open with the following string: NETCDF:"/vsis3/<bucket_name>/<netcdf_file_name>":<variable>:<index>. GDAL does not appear to be able to handle the additional index value appended to the string. If this functionality is not supported by GDAL, perhaps it would make sense to open the whole variable and then extract individual bands as required, for example:
ds = gdal.Open('NETCDF:"/vsis3/<bucket_name>/<netcdf_file_name>":<variable>')
band = ds.GetRasterBand(<index>)  # GDAL bands are numbered from 1
render(band)
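A slightly fuller sketch of the same idea in Python, to make the suggestion concrete. This is not eoxserver's actual code: the helper names split_band_index and read_time_slice are hypothetical, and it assumes the locator always has the form NETCDF:"<path>":<variable>[:<index>], that GDAL exposes the time dimension of the variable as raster bands, and that the stored index is zero-based (GDAL band numbers start at 1).

```python
import re

# Matches 'NETCDF:<path>:<variable>:<digits>' and captures the part GDAL
# understands ('NETCDF:<path>:<variable>') separately from the index.
_INDEX_SUFFIX = re.compile(r'^(?P<base>NETCDF:.+:[^:"]+):(?P<index>\d+)$')


def split_band_index(locator):
    """Split a locator into (gdal_path, index).

    Returns (locator, None) when no trailing index is present.
    """
    match = _INDEX_SUFFIX.match(locator)
    if match is None:
        return locator, None
    return match.group("base"), int(match.group("index"))


def read_time_slice(locator):
    """Open the variable once and read a single time slice as an array."""
    # Imported lazily so split_band_index can be used without GDAL installed.
    from osgeo import gdal

    gdal_path, index = split_band_index(locator)
    ds = gdal.Open(gdal_path)
    if ds is None:
        raise RuntimeError(f"GDAL could not open {gdal_path!r}")
    # Assumption: the stored index is zero-based, GDAL bands are one-based.
    band = ds.GetRasterBand(1 if index is None else index + 1)
    return band.ReadAsArray()
```

With a hypothetical bucket name, split_band_index('NETCDF:"/vsis3/my-bucket/file.nc":clt:3599') yields the path without the index plus the integer 3599, so only the three-field string is ever passed to gdal.Open.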
Desktop:
OS: Ubuntu 20.04.6 LTS
Source code forked from latest eoxserver master branch
Additional Information
I also updated contrib/gdal.py at line 186 so that the index value is not included when determining the shape of the data. Without this change, the same error (Failed to parse NETCDF: prefix string into expected 2, 3 or 4 fields.) was raised during registration and prevented the data from being registered at all. I added the following line: path = path[:-2] if path[-2:] == ":0" else path
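The one-line patch above only handles an index of exactly 0. A more general variant might look like the sketch below. The function name strip_netcdf_index is hypothetical, and the sketch assumes that only NETCDF: paths carry an index suffix and that no variable name consists solely of digits (such a name would be mistaken for an index).

```python
import re

# A trailing ':<digits>' at the very end of the path.
_TRAILING_INDEX = re.compile(r":\d+$")


def strip_netcdf_index(path):
    """Drop a trailing ':<index>' from 'NETCDF:...' paths.

    Paths that do not use the NETCDF: prefix are returned unchanged.
    """
    if not path.startswith("NETCDF:"):
        return path
    return _TRAILING_INDEX.sub("", path)
```

Unlike the slice-based check, this also removes multi-digit indices such as :3599 and leaves ordinary file paths untouched.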