Closed eliocamp closed 4 years ago
I have the same issue.
Can we check whether this occurs with a publicly accessible file, e.g.,
http://geog.uoregon.edu/GeogR/data/raster/cru10min30_tmp.nc ?
Also, to make sure we're all on the same page, please use the current release of ncdf4
published on CRAN on 2019-10-23.
I am running this code:
library(reticulate)
# use_virtualenv("0105", T) # just some env
library(ncdf4)
xr <- import("xarray")
file <- "~/Downloads/cru10min30_tmp.nc"
dt <- xr$open_dataset(file)
dt
without getting an error.
Your code works, but I think it's because your file is not a NetCDF4 file, but a NetCDF3 one.
file3 <- path.expand("~/Downloads/cru10min30_tmp.nc")
file4 <- path.expand("~/Downloads/cru10min30_tmp.nc4")
url <- "http://geog.uoregon.edu/GeogR/data/raster/cru10min30_tmp.nc"
download.file(url, destfile = file3)
# Converts file to NetCDF4
# (Needs nco operators)
system(paste("ncks -4 -O ", file3, file4))
library(reticulate)
# use_virtualenv("0105", T) # just some env
library(ncdf4)
xr <- import("xarray")
# Opens NetCDF3 with no problem
dt3 <- xr$open_dataset(file3)
dt3
#> <xarray.Dataset>
#> Dimensions: (lat: 360, lon: 720, nv: 2, time: 12)
#> Coordinates:
#> * lon (lon) float64 -179.8 -179.2 -178.8 -178.2 ... 178.8 179.3 179.8
#> * lat (lat) float64 -89.75 -89.25 -88.75 -88.25 ... 88.75 89.25 89.75
#> * time (time) datetime64[ns] 1976-01-16T12:00:00 ... 1976-12-16T12:00:00
#> Dimensions without coordinates: nv
#> Data variables:
#> time_bounds (time, nv) float32 ...
#> tmp (time, lat, lon) float32 ...
#> Attributes:
#> data: CRU CL 2.0 1961-1990 Monthly Averages
#> title: CRU CL 2.0 -- 10min grid sampled every 0.5 degree
#> institution: http://www.cru.uea.ac.uk/
#> source: http://www.cru.uea.ac.uk/~markn/cru05/cru05_intro.html
#> references: New et al. (2002) Climate Res 21:1-25
#> history: Wed Oct 29 11:27:35 2014: ncrename -v climatology_bounds,ti...
#> Conventions: CF-1.0
# But not NetCDF4
dt4 <- xr$open_dataset(file4)
#> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: Error: /home/elio/Downloads/cru10min30_tmp.nc4 is not a valid NetCDF 3 file
#> If this is a NetCDF4 file, you may need to install the
#> netcdf4 library, e.g.,
#>
#> $ pip install netcdf4
#>
#>
#> Detailed traceback:
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/api.py", line 451, in open_dataset
#> ds = maybe_decode_store(store)
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/api.py", line 381, in maybe_decode_store
#> drop_variables=drop_variables, use_cftime=use_cftime)
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/conventions.py", line 517, in decode_cf
#> vars, attrs = obj.load()
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/common.py", line 121, in load
#> for k, v in self.get_variables().items())
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/scipy_.py", line 166, in get_variables
#> for k, v in self.ds.variables.items())
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/scipy_.py", line 158, in ds
#> return self._manager.acquire()
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 168, in acquire
#> file, _ = self._acquire_with_cache_info(needs_lock)
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 192, in _acquire_with_cache_info
#> file = self._opener(*self._args, **kwargs)
#> File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/scipy_.py", line 100, in _open_scipy_netcdf
#> raise TypeError(errmsg)
dt4
#> Error in eval(expr, envir, enclos): object 'dt4' not found
Created on 2020-01-08 by the reprex package (v0.3.0)
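Whether a given file is NetCDF3 or NetCDF4 can be checked from its first bytes, without relying on the file extension. The following is a minimal sketch, not something used in this thread: the function name `netcdf_format` is made up for illustration, and it only looks at offset 0, where the signature normally sits.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of the netCDF on-disk formats.
# Classic-family files start with "CDF" plus a version byte; netCDF-4 files
# are HDF5 containers and start with the HDF5 signature.
_SIGNATURES = {
    b"CDF\x01": "netCDF-3 classic",
    b"CDF\x02": "netCDF-3 64-bit offset",
    b"CDF\x05": "netCDF-3 CDF-5 (64-bit data)",
    b"\x89HDF\r\n\x1a\n": "netCDF-4 (HDF5 container)",
}


def netcdf_format(path):
    """Return a label for the file's on-disk format based on its first bytes."""
    with open(Path(path).expanduser(), "rb") as fh:
        head = fh.read(8)
    for magic, label in _SIGNATURES.items():
        if head.startswith(magic):
            return label
    return "unknown"
```

Calling `netcdf_format("~/Downloads/cru10min30_tmp.nc")` and the same for the `.nc4` copy would show whether the `ncks -4` conversion really produced an HDF5-based file.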
My guess is that xarray is not using the netcdf4 library to read netcdf3 files, so loading {ncdf4} does not conflict with it.
And just to be sure, if {ncdf4} is not loaded, then everything runs smoothly with both NetCDF files.
file3 <- path.expand("~/Downloads/cru10min30_tmp.nc")
file4 <- path.expand("~/Downloads/cru10min30_tmp.nc4")
url <- "http://geog.uoregon.edu/GeogR/data/raster/cru10min30_tmp.nc"
download.file(url, destfile = file3)
# Converts file to NetCDF4
# (Needs nco operators)
system(paste("ncks -4 -O ", file3, file4))
library(reticulate)
xr <- import("xarray")
dt3 <- xr$open_dataset(file3)
dt4 <- xr$open_dataset(file4)
Created on 2020-01-08 by the reprex package (v0.3.0)
Thanks, I can reproduce! I'll keep you posted :-)
Actually - sorry, I answered too fast :-)
I get that same error even with ncdf4 NOT loaded...
Wait, are you getting the error when using the code of the last example even in a new session? :sob:
Yeah, exactly. And when I leave R out of the equation completely:
Python 3.7.5 (default, Dec 15 2019, 17:54:26)
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xarray as xarr
>>> xarr.open_data("cru10min30_tmp.nc4")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'xarray' has no attribute 'open_data'
>>> xarr.open_dataset("cru10min30_tmp.nc4")
Traceback (most recent call last):
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/file_manager.py", line 198, in _acquire_with_cache_info
file = self._cache[self._key]
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/lru_cache.py", line 53, in __getitem__
value = self._cache[key]
KeyError: [<function _open_scipy_netcdf at 0x7f295662d440>, ('/home/key/Downloads/cru10min30_tmp.nc4',), 'r', (('mmap', None), ('version', 2))]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 83, in _open_scipy_netcdf
return scipy.io.netcdf_file(filename, mode=mode, mmap=mmap, version=version)
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/scipy/io/netcdf.py", line 284, in __init__
self._read()
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/scipy/io/netcdf.py", line 609, in _read
self.filename)
TypeError: Error: /home/key/Downloads/cru10min30_tmp.nc4 is not a valid NetCDF 3 file
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/api.py", line 535, in open_dataset
ds = maybe_decode_store(store)
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/api.py", line 450, in maybe_decode_store
use_cftime=use_cftime,
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/conventions.py", line 570, in decode_cf
vars, attrs = obj.load()
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/common.py", line 123, in load
(_decode_variable_name(k), v) for k, v in self.get_variables().items()
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 157, in get_variables
(k, self.open_store_variable(k, v)) for k, v in self.ds.variables.items()
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 146, in ds
return self._manager.acquire()
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/file_manager.py", line 180, in acquire
file, _ = self._acquire_with_cache_info(needs_lock)
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/file_manager.py", line 204, in _acquire_with_cache_info
file = self._opener(*self._args, **kwargs)
File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 94, in _open_scipy_netcdf
raise TypeError(errmsg)
TypeError: Error: /home/key/Downloads/cru10min30_tmp.nc4 is not a valid NetCDF 3 file
If this is a NetCDF4 file, you may need to install the
netcdf4 library, e.g.,
$ pip install netcdf4
So I installed netcdf4, and then it worked for me from Python as well as from R, with or without library(ncdf4) having been executed:
> file4 <- path.expand("~/Downloads/cru10min30_tmp.nc4")
> library(reticulate)
> use_virtualenv("0105", T)
> library(ncdf4)
> xr <- import("xarray")
> dt <- xr$open_dataset(file4)
> dt
<xarray.Dataset>
Dimensions: (lat: 360, lon: 720, nv: 2, time: 12)
Coordinates:
* lat (lat) float64 -89.75 -89.25 -88.75 -88.25 ... 88.75 89.25 89.75
* lon (lon) float64 -179.8 -179.2 -178.8 -178.2 ... 178.8 179.3 179.8
* time (time) datetime64[ns] 1976-01-16T12:00:00 ... 1976-12-16T12:00:00
Dimensions without coordinates: nv
Data variables:
time_bounds (time, nv) float32 ...
tmp (time, lat, lon) float32 ...
Attributes:
data: CRU CL 2.0 1961-1990 Monthly Averages
title: CRU CL 2.0 -- 10min grid sampled every 0.5 degree
institution: http://www.cru.uea.ac.uk/
source: http://www.cru.uea.ac.uk/~markn/cru05/cru05_intro.html
references: New et al. (2002) Climate Res 21:1-25
history: Wed Jan 8 22:12:00 2020: ncks -4 -O /home/key/Downloads/cr...
Conventions: CF-1.0
NCO: netCDF Operators version 4.8.1 (Homepage = http://nco.sf.ne...
Can you try again after running pip install netcdf4 in Python and see if it solves the problem?
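The traceback above ends in xarray's scipy backend, which reads NetCDF3 only. That suggests this interpreter could not use a NetCDF4-capable module when the file was opened. Here is a small sketch for checking which candidate modules the current interpreter can locate. The helper name `available_backends` is an assumption for illustration; the module names are the Python import names.

```python
import importlib.util
import sys

# Modules xarray can use to read netCDF files. Only netCDF4 and h5netcdf
# handle the netCDF-4/HDF5 format; scipy reads netCDF-3 only.
BACKEND_MODULES = ("netCDF4", "h5netcdf", "scipy")


def available_backends(modules=BACKEND_MODULES):
    """Map each module name to True if this interpreter can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name, found in available_backends().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

If `netCDF4` is reported as found but the error persists, passing `engine="netcdf4"` to `open_dataset` should surface the underlying import or load error instead of silently falling back to scipy.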
Testing on my home computer, I had to install everything from scratch, and it works. Let me test it on my work computer (where it failed originally) after the weekend. Maybe it was a problem with the Python packages that is now fixed?
Maybe! Were you able to test on the other machine already?
Yeah, unfortunately the original error is still present on my other machine. I ran pip install netcdf4 in my terminal and also tried py_install("netcdf4") in R (got "All requested packages already installed."). :sob:
I don't know if it helps, but this is the result of py_config()
reticulate::py_config()
#> python: /home/elio/miniconda3/envs/r-reticulate/bin/python
#> libpython: /home/elio/miniconda3/envs/r-reticulate/lib/libpython3.7m.so
#> pythonhome: /home/elio/miniconda3/envs/r-reticulate:/home/elio/miniconda3/envs/r-reticulate
#> version: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0]
#> numpy: /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/numpy
#> numpy_version: 1.17.1
#>
#> python versions found:
#> /home/elio/miniconda3/envs/r-reticulate/bin/python
#> /usr/bin/python
#> /usr/bin/python3
#> /home/elio/.virtualenvs/r-keras/bin/python
#> /home/elio/miniconda3/envs/metR/bin/python
#> /home/elio/miniconda3/bin/python
Created on 2020-01-14 by the reprex package (v0.3.0)
That's all pretty strange... idk if it helps, but could you try uninstalling netcdf4 (pip uninstall netcdf4), then reinstalling it, and trying again?
Still failing :'(
Hm :-(
Can you check that all versions (ncdf4 R package, xarray and netCDF4 Python packages) are the same on both machines?
Also, can you run
sudo updatedb
locate libnetcdf | xargs ldd
on both machines?
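One possible reason for asking about `libnetcdf` is that the R package and the Python module might each load a different copy of the netCDF or HDF5 shared libraries into the same process. Nothing in this thread confirms that. The sketch below (Linux only, with a made-up helper name `loaded_libraries`) lists which such libraries are mapped into the running process by parsing `/proc/self/maps`.

```python
def loaded_libraries(maps_text, needles=("libnetcdf", "libhdf5")):
    """Return sorted paths of mapped files whose path contains one of the needles.

    `maps_text` is the content of /proc/<pid>/maps; each line has the form
    "address perms offset dev inode pathname", with pathname optional.
    """
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6 and any(n in fields[5] for n in needles):
            paths.add(fields[5].strip())
    return sorted(paths)


if __name__ == "__main__":
    # Linux only: inspect the current process.
    with open("/proc/self/maps") as fh:
        for path in loaded_libraries(fh.read()):
            print(path)
```

Run inside the R session through reticulate, after both `library(ncdf4)` and `import("xarray")`, this would show whether two different `libnetcdf` builds end up loaded side by side.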
Ok, I'll send you both reports later today.
One weird thing is that pip freeze (which Google tells me lists all installed modules) didn't list xarray as installed, even though py_available("xarray") was TRUE. Now, after running pip install xarray, I get these version numbers from pip freeze:
xarray==0.11.3
netCDF4==1.5.3
However, checking from R gives me different results:
library(reticulate)
mt <- import("importlib_metadata")
mt$version("xarray")
#> [1] "0.14.1"
mt$version("netCDF4")
#> [1] "1.5.1.2"
packageVersion("ncdf4")
#> [1] '1.17'
Created on 2020-01-14 by the reprex package (v0.3.0)
So something fishy is going on with the different module installations.
PS: pip3 freeze gives yet another combination of version numbers:
xarray==0.14.1
netCDF4==1.5.3
PPS: Oh, all this is on the PC where the code fails.
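Different answers from `pip freeze`, `pip3 freeze`, and reticulate usually mean that each command is tied to a different Python installation. On Linux, the `pip` executables are typically small scripts whose first line names the interpreter they belong to. This is a rough sketch for reading that line; `script_interpreter` is a hypothetical helper, not part of pip or reticulate.

```python
import shutil


def script_interpreter(path):
    """Return the interpreter named on a script's shebang line, or None."""
    with open(path, "rb") as fh:
        first = fh.readline().decode("utf-8", errors="replace").strip()
    return first[2:].strip() if first.startswith("#!") else None


if __name__ == "__main__":
    for command in ("pip", "pip3"):
        location = shutil.which(command)
        owner = script_interpreter(location) if location else None
        print(f"{command}: {location} -> {owner}")
```

Invoking pip as `<interpreter> -m pip freeze`, using the exact interpreter path that `py_config()` reports, avoids the ambiguity altogether.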
That looks weird! But, can we first make sure all these refer to the same environment?
For python, on the command line: conda activate r-reticulate
For reticulate, use_condaenv("r-reticulate", required = TRUE) (please don't omit that required = TRUE)
Just to be sure, better double-check with py_config() as well...
I get conda: command not found. I don't remember having installed conda or anaconda or whatever.
Do you mean miniconda was installed by reticulate, not by you yourself? (This is possible starting from 1.14, see https://blog.rstudio.com/2019/12/20/reticulate-1-14/)
Can you run /home/elio/miniconda3/envs/r-reticulate/bin/python directly?
Also, can you check if you have any conda-related entries in your bashrc?
Yep, /home/elio/miniconda3/envs/r-reticulate/bin/python runs python alright, and using importlib_metadata inside that session I get the same versions as from R. There's no conda stuff in .bashrc.
OK so summarizing, you have (at least) 3 environments with different versions of the libraries concerned...
Can we do a comparison test? In all cases, please make sure the "right" env is used by first restarting R and then, doing one of
use_condaenv("...", required = TRUE)
use_python("...", required = TRUE)
use_virtualenv("...", required = TRUE)
(Also please double-check using py_config().)
1) condaenv r-reticulate, which has xarray 0.14.1, netCDF4 1.5.1.2
2) whichever env it is that has xarray 0.14.1, netCDF4 1.5.3
3) whichever env it is that has xarray 0.11.3, netCDF4 1.5.3
Please for all three setups test if you get "that weird thing" (that it works when the R pkg is not loaded but errors if it is).
As a sidenote, my env - where stuff works fine - corresponds to setup (2) above.
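To match the three setups above to actual interpreters, each candidate Python can be asked directly which package versions it sees. The sketch below is a suggestion only: `probe` is a hypothetical helper, and on Python 3.7 it assumes the `importlib_metadata` backport is installed in the interpreter being probed.

```python
import json
import subprocess
import sys

# Small program executed inside each candidate interpreter.
_PROBE = r"""
import json, sys
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport, if installed
out = {"python": sys.executable}
for name in sys.argv[1:]:
    try:
        out[name] = metadata.version(name)
    except Exception:
        out[name] = None  # package not installed for this interpreter
print(json.dumps(out))
"""


def probe(interpreter, packages=("xarray", "netCDF4")):
    """Ask another interpreter which versions of `packages` it sees."""
    result = subprocess.run(
        [interpreter, "-c", _PROBE, *packages],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Replace with the paths listed under "python versions found" in py_config().
    for exe in (sys.executable,):
        print(probe(exe))
```

Feeding it the paths from `py_config()`, plus whatever `pip` and `pip3` point to, would show which interpreter corresponds to which version combination.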
Hi, did you have a chance to test this? Thanks.
Sorry, I was busy and out of the office. I'll get to it asap :)
I have these 3 environments (don't ask me why!)
reticulate::conda_list()
#> name python
#> 1 miniconda3 /home/elio/miniconda3/bin/python
#> 2 netCDF4-python /home/elio/miniconda3/envs/netCDF4-python/bin/python
#> 3 r-reticulate /home/elio/miniconda3/envs/r-reticulate/bin/python
Testing each (always restarting R between tests):
test_env <- function(env) {
library(reticulate)
use_condaenv(env, required = TRUE)
print(py_config())
file <- "~/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4"
library(ncdf4)
xr <- try(import("xarray"), silent = TRUE)
if (inherits(xr, "try-error")) {
return(list(env = env,
xarray_version = NA,
netcdf4_version = NA,
success = NA))
}
dt <- try(xr$open_dataset(file), silent = TRUE)
mt <- import("importlib_metadata")
return(list(env = env,
xarray_version = mt$version("xarray"),
netcdf4_version = mt$version("netCDF4"),
success = !inherits(dt, "try-error")))
}
test_env("miniconda3")
#> python: /home/elio/miniconda3/bin/python
#> libpython: /home/elio/miniconda3/lib/libpython3.7m.so
#> pythonhome: /home/elio/miniconda3:/home/elio/miniconda3
#> version: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
#> numpy: [NOT FOUND]
#>
#> NOTE: Python version was forced by use_python function
#> $env
#> [1] "miniconda3"
#>
#> $xarray_version
#> [1] NA
#>
#> $netcdf4_version
#> [1] NA
#>
#> $success
#> [1] NA
This environment doesn't even have xarray installed.
test_env("netCDF4-python")
#> Error in use_python(conda_env_python[[1]], required = required): Specified version of python '/home/elio/miniconda3/envs/netCDF4-python/bin/python' does not exist.
This one doesn't exist?
test_env("r-reticulate")
#> python: /home/elio/miniconda3/envs/r-reticulate/bin/python
#> libpython: /home/elio/miniconda3/envs/r-reticulate/lib/libpython3.7m.so
#> pythonhome: /home/elio/miniconda3/envs/r-reticulate:/home/elio/miniconda3/envs/r-reticulate
#> version: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0]
#> numpy: /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/numpy
#> numpy_version: 1.17.1
#>
#> NOTE: Python version was forced by use_python function
#> $env
#> [1] "r-reticulate"
#>
#> $xarray_version
#> [1] "0.14.1"
#>
#> $netcdf4_version
#> [1] "1.5.1.2"
#>
#> $success
#> [1] FALSE
And this one has the usual error.
I don't know how to get to the other environments.
Ok. I ran this
library(reticulate)
use_condaenv("r-reticulate", required = TRUE)
py_install("netcdf4", pip = TRUE)
And now the netcdf4 version in r-reticulate is indeed 1.5.3 AND the bug seems to be gone!
file <- "~/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4"
library(ncdf4)
library(reticulate)
xr <- import("xarray")
mt <- import("importlib_metadata")
mt$version("xarray")
#> [1] "0.14.1"
mt$version("netcdf4")
#> [1] "1.5.3"
py_config()
#> python: /home/elio/miniconda3/envs/r-reticulate/bin/python
#> libpython: /home/elio/miniconda3/envs/r-reticulate/lib/libpython3.7m.so
#> pythonhome: /home/elio/miniconda3/envs/r-reticulate:/home/elio/miniconda3/envs/r-reticulate
#> version: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0]
#> numpy: /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/numpy
#> numpy_version: 1.17.1
#> xarray: /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray
#>
#> python versions found:
#> /home/elio/miniconda3/envs/r-reticulate/bin/python
#> /home/elio/miniconda3/envs/r-reticulate/bin/python3
#> /usr/bin/python3
#> /usr/bin/python
#> /home/elio/.virtualenvs/r-keras/bin/python
#> /home/elio/miniconda3/bin/python
dt <- xr$open_dataset(file)
dt
#> <xarray.Dataset>
#> Dimensions: (lat: 128, lon: 256, plev: 2, time: 660)
#> Coordinates:
#> * time (time) datetime64[ns] 1850-01-01 1850-04-01 ... 2014-10-01
#> * lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
#> * lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
#> * plev (plev) float32 500.0 2000.0
#> Data variables:
#> zg (time, plev, lat, lon) float32 ...
Created on 2020-01-23 by the reprex package (v0.3.0)
Thanks for testing! So perhaps it really was an issue of library versions (for whatever reason)?
As you have a working configuration now, should we regard this as solved?
I guess we should! Thanks for the troubleshooting and sorry for all the trouble.
No problem :-)
I'm trying to use xarray with reticulate and it can't seem to find the correct dependencies. The problem seems to arise when I use the ncdf4 R package. Here's a reproducible example
Loading ncdf4 (I get the same result if I use any function with
ncdf4::...
Created on 2019-08-29 by the reprex package (v0.3.0)
If I don't load ncdf4, everything runs smoothly.
I have no idea why just loading an R package throws a monkey wrench into the python module. I don't know if this is a reticulate issue or something else.