Closed weiji14 closed 3 years ago
Thanks @weiji14 for opening this issue. When we implemented open_rasterio
, I don't think we intended this to be used for netCDF data so I'm surprised it works (worked) at all. I think we'll need someone to dig into the xarray implementation a bit to see if we can pull out the correct coordinates (metadata) using rasterio.
Well open_rasterio
did have an "experimental" warning on it in the docs :laughing:, but it was really nice having it work on GeoTIFFs and NetCDF files. I've forked the repo and will try to debug the situation a bit more. If anyone who's worked on that part of the codebase before has any pointers on what might be the cause of this issue / where to start that would be great.
Hold on, the coordinates seems to be parsed out correctly from the netCDF file (even with netCDF==1.5.1.2) when I have a clean conda installation created following the instructions at https://xarray.pydata.org/en/latest/contributing.html#creating-a-python-environment.
I've isolated the issue and think the problem arises when I also import salem
(an xarray accessor)... Will try to narrow this down before I close this issue.
I guess GDAL can also read netCDF files :).
Indeed, we did not anticipate people using open_rasterio
this way.
GDAL can open so many formats (https://gdal.org/drivers/raster/index.html). We've really only tested against TIFF like formats though. I think the assumption will need to be is that rasterio will provide standard metadata for each of these drivers.
Yes, there's https://gdal.org/drivers/raster/netcdf.html :smile: I've done a bit more debugging (having temporarily isolated salem from my script) and am still having issues with my setup.
The clean xarray-tests conda environment that works with netcdf==1.5.1.2
has libnetcdf: 4.6.2
, but for some strange reason, running xr.show_versions()
on my setup shows libnetcdf: 4.6.3
even though conda list | grep libnetcdf
shows that I've installed libnetcdf 4.6.2 h056eaf5_1002 conda-forge
.
Not sure if this libnetcdf 4.6.3 version is the problem, but it stands out the most (to me at least) when looking at the diff between my setup and the clean one. Is there a way to check the order in which xarray looks for the netcdf binaries as I feel it might be a PATH related issue. Also not sure if this issue fits here in xarray
or somewhere else now...
The clean xarray-tests conda environment that works with
netcdf==1.5.1.2
haslibnetcdf: 4.6.2
, but for some strange reason, runningxr.show_versions()
on my setup showslibnetcdf: 4.6.3
even thoughconda list | grep libnetcdf
shows that I've installedlibnetcdf 4.6.2 h056eaf5_1002 conda-forge
.
xr.show_versions()
gets its libnetcdf version from netCDF4 (specially netCDF4.__netcdf4libversion__
). So I'm guessing that somehow netCDF4 is picking up libnetcdf from somewhere else -- maybe you pip installed it? It might be worth trying another fresh conda environment...
xr.show_versions()
gets its libnetcdf version from netCDF4 (speciallynetCDF4.__netcdf4libversion__
). So I'm guessing that somehow netCDF4 is picking up libnetcdf from somewhere else -- maybe you pip installed it? It might be worth trying another fresh conda environment...
I'm not sure where it's picking up the libnetcdf 4.6.3 version from, but I found your comment at https://github.com/pydata/xarray/issues/2535#issuecomment-445944261 and think it might indeed be an incompatibility issue with rasterio and netCDF4 binary wheels (do rasterio wheels include netcdf binaries?). Probably somewhat related to https://github.com/mapbox/rasterio/issues/1574 too.
Managed to get things to work by combining the workaround in this Pull Request and StackOverflow post, basically having pip compile the netcdf
python package from source instead of using the wheel:
HDF5_DIR=$CONDA_PREFIX pip install --no-binary netCDF4 netCDF4==1.4.2
where $CONDA_PREFIX is the path to the conda environment e.g. /home/jovyan/.conda/envs/name-of-env
. I've tested my MCVE code sample above and it works up to the latest netCDF4==1.5.1.2
version!
I'm seeing a problem with geotiffs that I think might be related:
$ gdalinfo /home/nedcr/cr/software/ana/lib/geo/tests/data/Urandangi_MGA55.tif
Driver: GTiff/GeoTIFF
Files: /home/nedcr/cr/software/ana/lib/geo/tests/data/Urandangi_MGA55.tif
/home/nedcr/cr/software/ana/lib/geo/tests/data/Urandangi_MGA55.tif.aux.xml
Size is 10, 10
Coordinate System is:
PROJCS["GDA94 / MGA zone 55",
GEOGCS["GDA94",
DATUM["Geocentric_Datum_of_Australia_1994",
SPHEROID["GRS 1980",6378137,298.257222101,
AUTHORITY["EPSG","7019"]],
TOWGS84[0,0,0,0,0,0,0],
AUTHORITY["EPSG","6283"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4283"]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",147],
PARAMETER["scale_factor",0.9996],
PARAMETER["false_easting",500000],
PARAMETER["false_northing",10000000],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]],
AXIS["Easting",EAST],
AXIS["Northing",NORTH],
AUTHORITY["EPSG","28355"]]
Origin = (-406507.954209543997422,7588834.152862589806318)
Pixel Size = (263.500000000000000,-265.500000000000000)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=BAND
Corner Coordinates:
Upper Left ( -406507.954, 7588834.153) (138d16' 6.99"E, 21d34'24.95"S)
Lower Left ( -406507.954, 7586179.153) (138d16' 1.84"E, 21d35'50.30"S)
Upper Right ( -403872.954, 7588834.153) (138d17'37.56"E, 21d34'29.73"S)
Lower Right ( -403872.954, 7586179.153) (138d17'32.42"E, 21d35'55.09"S)
Center ( -405190.454, 7587506.653) (138d16'49.70"E, 21d35'10.02"S)
Band 1 Block=10x10 Type=Float32, ColorInterp=Gray
Min=0.069 Max=8.066
Minimum=0.069, Maximum=8.066, Mean=2.556, StdDev=1.749
NoData Value=-3.40282306073709653e+38
Metadata:
STATISTICS_MAXIMUM=8.06591796875
STATISTICS_MEAN=2.5563781395387
STATISTICS_MINIMUM=0.068740844726562
STATISTICS_STDDEV=1.7493082797107
STATISTICS_VALID_PERCENT=89
import xarray as xr
from osgeo.gdal import Open
ras = Open('/home/nedcr/cr/software/ana/lib/geo/tests/data/Urandangi_MGA55.tif')
ras.GetGeoTransform()
# (-406507.954209544, 263.5, 0.0, 7588834.15286259, 0.0, -265.5)
ds = xr.open_rasterio('/home/nedcr/cr/software/ana/lib/geo/tests/data/Urandangi_MGA55.tif')
ds.transform
# (263.5, 0.0, -406507.954209544, 0.0, -265.5, 7588834.15286259)
The transform in the xarray dataset is transposed, and not really useful anymore.
xr.show_versions()
@naught101 Looks like those coordinates are now in the order as declared in the affine package (since rasterio >= 1.0). See also: https://rasterio.readthedocs.io/en/latest/topics/migrating-to-v1.html.
All coordinates are created correctly, so from my point of view, only the docs here have to be changed.
from affine import Affine
da = xr.open_rasterio('path_to_file.tif')
transform = Affine(*da.attrs['transform'])
nx, ny = da.sizes['x'], da.sizes['y']
x, y = np.meshgrid(np.arange(nx)+0.5, np.arange(ny)+0.5) * transform
Maybe this can be closed now, since #5021 changed the docs?
@naught101 Looks like those coordinates are now in the order as declared in the affine package (since rasterio >= 1.0). See also: https://rasterio.readthedocs.io/en/latest/topics/migrating-to-v1.html.
All coordinates are created correctly, so from my point of view, only the docs here have to be changed.
from affine import Affine da = xr.open_rasterio('path_to_file.tif') transform = Affine(*da.attrs['transform']) nx, ny = da.sizes['x'], da.sizes['y'] x, y = np.meshgrid(np.arange(nx)+0.5, np.arange(ny)+0.5) * transform
yes, I would think so. Thanks for the reminder, @gabriel-abrahao.
The original issue was that libnetcdf picked up by netCDF4
didn't match the version installed by the netCDF4
wheels, which is not really xarray
's fault.
MCVE Code Sample
Adapted from the test_serialization unit test at here.
Actual Output (using netCDF4>=1.4.2)
Expected Output (using netCDF4==1.4.1)
Problem Description
I have a script which takes in either NetCDF or GeoTIFF files as an input and I've been using
xr.open_rasterio
to read them quite successfully before, as I'm basically just interested in parsing the correct XY coordinates out. However, upgrading from NetCDF 1.4.1 to 1.4.2 breaks this behaviour (xr.open_rasterio
no longer parses the coordinates properly from a .nc file).My hunch is that it has something to do with different NetCDF4 formats (e.g. ‘NETCDF4’, 'NETCDF4_CLASSIC’, ‘NETCDF3_64BIT’) but looking at the
xr.open_rasterio
code, I'm not too sure what's going on... Also not very sure if this is an upstream netcdf4-python or rasterio issue but I thought I'd report it here first.Output of
xr.show_versions()