Closed peterm790 closed 1 year ago
What is the output of rioxarray.show_versions()
?
Also, mind providing the full error traceback?
rioxarray.show_versions():
rioxarray (0.13.3) deps:
rasterio: 1.3.4
xarray: 2023.1.0
GDAL: 3.6.2
GEOS: 3.11.1
PROJ: 9.1.0
PROJ DATA: /share/apps/anaconda3-2022.05/envs/pangeo/share/proj
GDAL DATA: None
Other python deps:
scipy: 1.10.0
pyproj: 3.4.1
System:
python: 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:20:04) [GCC 11.3.0]
executable: /share/apps/anaconda3-2022.05/envs/pangeo/bin/python
machine: Linux-3.10.0-1160.80.1.el7.x86_64-x86_64-with-glibc2.17
MissingCRS Traceback (most recent call last)
Cell In[5], line 2
1 import rioxarray
----> 2 rds = rioxarray.open_rasterio(files[10], masked = True)
File /share/apps/anaconda3-2022.05/envs/pangeo/lib/python3.10/site-packages/rioxarray/_io.py:1115, in open_rasterio(filename, parse_coordinates, chunks, cache, lock, masked, mask_and_scale, variable, group, default_name, decode_times, decode_timedelta, band_as_variable, **open_kwargs)
1113 # open the subdatasets if they exist
1114 if riods.subdatasets:
-> 1115 subdataset_result = _load_subdatasets(
1116 riods=riods,
1117 group=group,
1118 variable=variable,
1119 parse_coordinates=parse_coordinates,
1120 chunks=chunks,
1121 cache=cache,
1122 lock=lock,
1123 masked=masked,
1124 mask_and_scale=mask_and_scale,
1125 decode_times=decode_times,
1126 decode_timedelta=decode_timedelta,
1127 **open_kwargs,
1128 )
1129 manager.close()
1130 return subdataset_result
File /share/apps/anaconda3-2022.05/envs/pangeo/lib/python3.10/site-packages/rioxarray/_io.py:791, in _load_subdatasets(riods, group, variable, parse_coordinates, chunks, cache, lock, masked, mask_and_scale, decode_times, decode_timedelta, **open_kwargs)
789 with rasterio.open(subdataset) as rds:
790 shape = rds.shape
--> 791 rioda: DataArray = open_rasterio( # type: ignore
792 subdataset,
793 parse_coordinates=shape not in dim_groups and parse_coordinates,
794 chunks=chunks,
795 cache=cache,
796 lock=lock,
797 masked=masked,
798 mask_and_scale=mask_and_scale,
799 default_name=subdataset.split(":")[-1].lstrip("/").replace("/", "_"),
800 decode_times=decode_times,
801 decode_timedelta=decode_timedelta,
802 **open_kwargs,
803 )
804 dim_groups[shape][rioda.name] = rioda
805 return _subdataset_groups_to_dataset(
806 dim_groups=dim_groups, global_tags=_parse_tags(riods.tags())
807 )
File /share/apps/anaconda3-2022.05/envs/pangeo/lib/python3.10/site-packages/rioxarray/_io.py:1224, in open_rasterio(filename, parse_coordinates, chunks, cache, lock, masked, mask_and_scale, variable, group, default_name, decode_times, decode_timedelta, band_as_variable, **open_kwargs)
1222 result.rio.write_crs(rio_crs, inplace=True)
1223 if has_gcps:
-> 1224 result.rio.write_gcps(*riods.gcps, inplace=True)
1226 if chunks is not None:
1227 result = _prepare_dask(result, riods, filename, chunks)
File /share/apps/anaconda3-2022.05/envs/pangeo/lib/python3.10/site-packages/rioxarray/rioxarray.py:1230, in XRasterBase.write_gcps(self, gcps, gcp_crs, grid_mapping_name, inplace)
1225 grid_mapping_name = (
1226 self.grid_mapping if grid_mapping_name is None else grid_mapping_name
1227 )
1228 data_obj = self._get_obj(inplace=True)
-> 1230 data_obj = data_obj.rio.write_crs(
1231 gcp_crs, grid_mapping_name=grid_mapping_name, inplace=inplace
1232 )
1233 geojson_gcps = _convert_gcps_to_geojson(gcps)
1234 data_obj.coords[grid_mapping_name].attrs["gcps"] = geojson_gcps
File /share/apps/anaconda3-2022.05/envs/pangeo/lib/python3.10/site-packages/rioxarray/rioxarray.py:487, in XRasterBase.write_crs(self, input_crs, grid_mapping_name, inplace)
484 pass
486 if data_obj.rio.crs is None:
--> 487 raise MissingCRS(
488 "CRS not found. Please set the CRS with 'rio.write_crs()'."
489 )
490 # add grid mapping coordinate
491 data_obj.coords[grid_mapping_name] = xarray.Variable((), 0)
MissingCRS: CRS not found. Please set the CRS with 'rio.write_crs()'.
This is interesting - it seems to think there are GCPS, but no CRS is found:
if has_gcps:
result.rio.write_gcps(*riods.gcps, inplace=True)
with rasterio.open("MOD11_L2.A2023060.2335.061.2023061090119.hdf") as rds:
for subdataset in rds.subdatasets:
with rasterio.open(subdataset) as subrds:
print(subdataset)
print(subrds.gcps[0][0], subrds.gcps[1])
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:LST
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:QC
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:Error_LST
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:Emis_31
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:Emis_32
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:View_angle
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:View_time
GroundControlPoint(row=2.5, col=2.5, x=115.89305877685547, y=-62.6209831237793, z=0.0, id='', info='') None
This returns nothing for the GCPS:
from osgeo import gdal
ds = gdal.Open("HDF4_EOS:EOS_SWATH:MOD11_L2.A2023060.2335.061.2023061090119.hdf:MOD_Swath_LST:LST")
ds.GetGCPSpatialRef()
I am wondering if GDAL needs to be updated to handle that (https://github.com/OSGeo/gdal)? Or if it is an issue with how the files are generated?
@peterm790, with the gist you added, there are no coordinates/transform. So, there isn't much you can do with the data and it can be misleading to have a CRS without the correct coordinates/transform/gcps. I am thinking it would likely be better to either address the issue with the data provider or with GDAL.
However, to allow reading the file, a PR with a change that skips writing the CRS if gcp_crs
is None here would be welcome:
https://github.com/corteva/rioxarray/blob/33a1be6f1bc3e30cb6d8c1a002d04f439dfb510f/rioxarray/rioxarray.py#L1233-L1235
Thank you very much @snowman2! I think I understand the problem better now.
This MOD11_L2 data is version 6.1 which has only recently been released so could well be a source data problem. I will see if the problem persists with older versions. And follow on from there.
I will set up a new branch with suggested PR shortly. Thanks
So just to update this issue for anybody that may come here in the future. My understanding of the MOD11_L2 data was a bit flawed, while it does contain ground control points it is not a geo-referenced product and so attempting to re project does not make sense. Instead the level 3 MOD11A1 product should be used. This document was useful to me: https://landweb.modaps.eosdis.nasa.gov/QA_WWW/forPage/user_guide/061/MOD11_C61_UsersGuide_revSudipta_revPete_Final.pdf
Despite this I do think https://github.com/corteva/rioxarray/pull/648 which allows the data and gcps to be opened and viewed makes sense.
I am looking to open MODIS level 2 hdf data, sourced from here. I've also uploaded a sample file to s3://peterm790/MOD11_L2.A2023060.2040.061.2023061090549.hdf for convenience.
My understanding is after https://github.com/corteva/rioxarray/pull/62 it should be possible to open these files by simply calling:
This however returns the following error:
a gist with the full error message: https://gist.github.com/peterm790/6a3dd266def15b2a5bfa04a3c63825a6
My understanding is the present API does not allow for manually setting the CRS when opening a file. Is there a way around this? if not does it make sense to allow for this? Or should it be possible to open the file without setting the CRS?
I have set up a branch: https://github.com/corteva/rioxarray/compare/master...peterm790:rioxarray:assign_crs which allows the CRS to be set manually with an
assign_crs
argument inopen_rasterio
such that:successfully opens the files. here's a gist of this https://gist.github.com/peterm790/f06d846e518c2ced7741afcda5365aed
Please do let me know if there is another method for opening these MODIS L2 files that I have missed? Or if a PR allowing the above change would make sense.