rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

Issue with python module dependency #587

Closed eliocamp closed 4 years ago

eliocamp commented 5 years ago

I'm trying to use xarray with reticulate and it can't seem to find the correct dependencies. The problem seems to arise when I use the ncdf4 R package. Here's a reproducible example.

Loading ncdf4 (I get the same result if I call any function via ncdf4::...):

file <- "~/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4"

library(ncdf4)

library(reticulate)
xr <- import("xarray")

dt <- xr$
   open_dataset(file)
#> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: Error: /home/elio/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4 is not a valid NetCDF 3 file
#>             If this is a NetCDF4 file, you may need to install the
#>             netcdf4 library, e.g.,
#> 
#>             $ pip install netcdf4
#>           

Created on 2019-08-29 by the reprex package (v0.3.0)

If I don't load ncdf4, everything runs smoothly.

file <- "~/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4"

# library(ncdf4)

library(reticulate)
xr <- import("xarray")

dt <- xr$
   open_dataset(file)
dt
#> <xarray.Dataset>
#> Dimensions:  (lat: 128, lon: 256, plev: 2, time: 660)
#> Coordinates:
#>   * time     (time) datetime64[ns] 1850-01-01 1850-04-01 ... 2014-10-01
#>   * lat      (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
#>   * lon      (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
#>   * plev     (plev) float32 500.0 2000.0
#> Data variables:
#>     zg       (time, plev, lat, lon) float32 ...

I have no idea why just loading an R package throws a monkey wrench into the python module. I don't know if this is a reticulate issue or something else.

kongdd commented 4 years ago

I have the same issue.

skeydan commented 4 years ago

Can we check this occurs with a publicly accessible file, e.g.,

http://geog.uoregon.edu/GeogR/data/raster/cru10min30_tmp.nc

?

Also, to make sure we're all on the same page, please use the current release of ncdf4, published on CRAN on 2019-10-23.

I am running this code:

library(reticulate)
# use_virtualenv("0105", T) # just some env

library(ncdf4)
xr <- import("xarray")

file <- "~/Downloads/cru10min30_tmp.nc"
dt <- xr$open_dataset(file)
dt

without getting an error.

eliocamp commented 4 years ago

Your code works, but I think it's because your file is not a NetCDF4 file, but a NetCDF3 one.
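One quick way to confirm which format a file really is, independent of its extension, is to look at its first bytes. The sketch below assumes the standard signatures: `CDF\x01` / `CDF\x02` for classic NetCDF 3 and the HDF5 signature for NetCDF4. The function name `sniff_netcdf_format` is made up for illustration, and it only checks offset 0, so an HDF5 file with a user block would be reported as unknown.

```python
from pathlib import Path

# Leading bytes that identify each on-disk format.
NETCDF3_MAGICS = (b"CDF\x01", b"CDF\x02")  # classic and 64-bit-offset variants
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"          # NetCDF4 files are HDF5 containers


def sniff_netcdf_format(path):
    """Return 'netcdf3', 'netcdf4/hdf5', or 'unknown' based on the file's first bytes."""
    with open(Path(path).expanduser(), "rb") as fh:
        head = fh.read(8)
    if head.startswith(NETCDF3_MAGICS):
        return "netcdf3"
    if head.startswith(HDF5_MAGIC):
        return "netcdf4/hdf5"
    return "unknown"
```

Calling `sniff_netcdf_format("~/Downloads/cru10min30_tmp.nc")` on the downloaded file would be expected to report the classic format if the guess above is right.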


file3 <- path.expand("~/Downloads/cru10min30_tmp.nc")
file4 <- path.expand("~/Downloads/cru10min30_tmp.nc4")
url <- "http://geog.uoregon.edu/GeogR/data/raster/cru10min30_tmp.nc"
download.file(url, destfile = file3)

# Converts file to NetCDF4 
# (Needs nco operators)
system(paste("ncks -4 -O ", file3, file4))

library(reticulate)
# use_virtualenv("0105", T) # just some env

library(ncdf4)
xr <- import("xarray")

# Opens NetCDF3 with no problem
dt3 <- xr$open_dataset(file3)
dt3
#> <xarray.Dataset>
#> Dimensions:      (lat: 360, lon: 720, nv: 2, time: 12)
#> Coordinates:
#>   * lon          (lon) float64 -179.8 -179.2 -178.8 -178.2 ... 178.8 179.3 179.8
#>   * lat          (lat) float64 -89.75 -89.25 -88.75 -88.25 ... 88.75 89.25 89.75
#>   * time         (time) datetime64[ns] 1976-01-16T12:00:00 ... 1976-12-16T12:00:00
#> Dimensions without coordinates: nv
#> Data variables:
#>     time_bounds  (time, nv) float32 ...
#>     tmp          (time, lat, lon) float32 ...
#> Attributes:
#>     data:         CRU CL 2.0 1961-1990 Monthly Averages
#>     title:        CRU CL 2.0 -- 10min grid sampled every 0.5 degree
#>     institution:  http://www.cru.uea.ac.uk/
#>     source:       http://www.cru.uea.ac.uk/~markn/cru05/cru05_intro.html
#>     references:   New et al. (2002) Climate Res 21:1-25
#>     history:      Wed Oct 29 11:27:35 2014: ncrename -v climatology_bounds,ti...
#>     Conventions:  CF-1.0

# But not NetCDF4
dt4 <- xr$open_dataset(file4)
#> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: Error: /home/elio/Downloads/cru10min30_tmp.nc4 is not a valid NetCDF 3 file
#>             If this is a NetCDF4 file, you may need to install the
#>             netcdf4 library, e.g.,
#> 
#>             $ pip install netcdf4
#>             
#> 
#> Detailed traceback: 
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/api.py", line 451, in open_dataset
#>     ds = maybe_decode_store(store)
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/api.py", line 381, in maybe_decode_store
#>     drop_variables=drop_variables, use_cftime=use_cftime)
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/conventions.py", line 517, in decode_cf
#>     vars, attrs = obj.load()
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/common.py", line 121, in load
#>     for k, v in self.get_variables().items())
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/scipy_.py", line 166, in get_variables
#>     for k, v in self.ds.variables.items())
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/scipy_.py", line 158, in ds
#>     return self._manager.acquire()
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 168, in acquire
#>     file, _ = self._acquire_with_cache_info(needs_lock)
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/file_manager.py", line 192, in _acquire_with_cache_info
#>     file = self._opener(*self._args, **kwargs)
#>   File "/home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray/backends/scipy_.py", line 100, in _open_scipy_netcdf
#>     raise TypeError(errmsg)
dt4
#> Error in eval(expr, envir, enclos): object 'dt4' not found

Created on 2020-01-08 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.2 (2019-12-12)
#>  os       elementary OS 5.1 Hera
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_US
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Argentina/Buenos_Aires
#>  date     2020-01-08
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version     date       lib source
#>  assertthat    0.2.1       2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.5       2019-10-02 [1] CRAN (R 3.6.1)
#>  callr         3.3.2       2019-09-22 [1] CRAN (R 3.6.1)
#>  cli           2.0.0       2019-12-09 [1] CRAN (R 3.6.2)
#>  crayon        1.3.4       2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0       2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.2.0.9000  2019-09-17 [1] Github (r-lib/devtools@2765fbe)
#>  digest        0.6.23      2019-11-23 [1] CRAN (R 3.6.1)
#>  ellipsis      0.3.0       2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate      0.14        2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi         0.4.0       2018-10-05 [1] CRAN (R 3.6.0)
#>  fs            1.3.1       2019-05-06 [1] CRAN (R 3.6.1)
#>  glue          1.3.1.9000  2019-09-17 [1] Github (tidyverse/glue@71eeddf)
#>  highr         0.8         2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools     0.4.0       2019-10-04 [1] CRAN (R 3.6.1)
#>  jsonlite      1.6         2018-12-07 [1] CRAN (R 3.6.0)
#>  knitr         1.25        2019-09-18 [1] CRAN (R 3.6.1)
#>  magrittr      1.5         2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0       2017-04-21 [1] CRAN (R 3.6.0)
#>  ncdf4       * 1.17        2019-10-23 [1] CRAN (R 3.6.2)
#>  pkgbuild      1.0.6       2019-10-09 [1] CRAN (R 3.6.1)
#>  pkgload       1.0.2       2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.0.2       2015-07-13 [1] CRAN (R 3.6.0)
#>  processx      3.4.1       2019-07-18 [1] CRAN (R 3.6.1)
#>  ps            1.3.0       2018-12-21 [1] CRAN (R 3.6.0)
#>  R6            2.4.1       2019-11-12 [1] CRAN (R 3.6.1)
#>  Rcpp          1.0.3       2019-11-08 [1] CRAN (R 3.6.1)
#>  remotes       2.1.0       2019-06-24 [1] CRAN (R 3.6.1)
#>  reticulate  * 1.13.0-9000 2019-08-27 [1] Github (rstudio/reticulate@5e0df26)
#>  rlang         0.4.1.9000  2019-11-12 [1] Github (r-lib/rlang@5a0b80a)
#>  rmarkdown     1.16        2019-10-01 [1] CRAN (R 3.6.1)
#>  rprojroot     1.3-2       2018-01-03 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1       2018-11-05 [1] CRAN (R 3.6.0)
#>  stringi       1.4.3       2019-03-12 [1] CRAN (R 3.6.0)
#>  stringr       1.4.0       2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.3.0       2019-11-05 [1] CRAN (R 3.6.1)
#>  usethis       1.5.1       2019-07-04 [1] CRAN (R 3.6.1)
#>  withr         2.1.2       2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.11        2019-11-12 [1] CRAN (R 3.6.1)
#>  yaml          2.2.0       2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /home/elio/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
```

My guess is that xarray is not using the netcdf4 library to read NetCDF3 files, so loading {ncdf4} does not conflict with it.
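The traceback above does go through xarray's `scipy_` backend, which fits that guess: the NetCDF4-capable backend apparently was not picked up. A minimal sketch for checking which NetCDF-related modules the current interpreter can locate is shown below. The helper name `available_netcdf_backends` is hypothetical, and `find_spec` only tells you a module can be found, not that importing it succeeds.

```python
import importlib.util


def available_netcdf_backends():
    """Report which NetCDF-capable modules can be located in this interpreter."""
    candidates = ("netCDF4", "h5netcdf", "scipy")
    # find_spec returns None when the top-level module is not on the import path.
    return {name: importlib.util.find_spec(name) is not None for name in candidates}
```

`xarray.open_dataset()` also accepts an `engine` argument (for example `engine="netcdf4"`), which may surface the underlying import error instead of the scipy fallback message.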

And just to be sure, if {ncdf4} is not loaded, then everything runs smoothly with both NetCDF files.


file3 <- path.expand("~/Downloads/cru10min30_tmp.nc")
file4 <- path.expand("~/Downloads/cru10min30_tmp.nc4")
url <- "http://geog.uoregon.edu/GeogR/data/raster/cru10min30_tmp.nc"
download.file(url, destfile = file3)

# Converts file to NetCDF4 
# (Needs nco operators)
system(paste("ncks -4 -O ", file3, file4))

library(reticulate)
xr <- import("xarray")
dt3 <- xr$open_dataset(file3)
dt4 <- xr$open_dataset(file4)

Created on 2020-01-08 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.2 (2019-12-12)
#>  os       elementary OS 5.1 Hera
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_US
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Argentina/Buenos_Aires
#>  date     2020-01-08
#>
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version     date       lib source
#>  assertthat    0.2.1       2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.5       2019-10-02 [1] CRAN (R 3.6.1)
#>  callr         3.3.2       2019-09-22 [1] CRAN (R 3.6.1)
#>  cli           2.0.0       2019-12-09 [1] CRAN (R 3.6.2)
#>  crayon        1.3.4       2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0       2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.2.0.9000  2019-09-17 [1] Github (r-lib/devtools@2765fbe)
#>  digest        0.6.23      2019-11-23 [1] CRAN (R 3.6.1)
#>  ellipsis      0.3.0       2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate      0.14        2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi         0.4.0       2018-10-05 [1] CRAN (R 3.6.0)
#>  fs            1.3.1       2019-05-06 [1] CRAN (R 3.6.1)
#>  glue          1.3.1.9000  2019-09-17 [1] Github (tidyverse/glue@71eeddf)
#>  highr         0.8         2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools     0.4.0       2019-10-04 [1] CRAN (R 3.6.1)
#>  jsonlite      1.6         2018-12-07 [1] CRAN (R 3.6.0)
#>  knitr         1.25        2019-09-18 [1] CRAN (R 3.6.1)
#>  magrittr      1.5         2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0       2017-04-21 [1] CRAN (R 3.6.0)
#>  pkgbuild      1.0.6       2019-10-09 [1] CRAN (R 3.6.1)
#>  pkgload       1.0.2       2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.0.2       2015-07-13 [1] CRAN (R 3.6.0)
#>  processx      3.4.1       2019-07-18 [1] CRAN (R 3.6.1)
#>  ps            1.3.0       2018-12-21 [1] CRAN (R 3.6.0)
#>  R6            2.4.1       2019-11-12 [1] CRAN (R 3.6.1)
#>  Rcpp          1.0.3       2019-11-08 [1] CRAN (R 3.6.1)
#>  remotes       2.1.0       2019-06-24 [1] CRAN (R 3.6.1)
#>  reticulate  * 1.13.0-9000 2019-08-27 [1] Github (rstudio/reticulate@5e0df26)
#>  rlang         0.4.1.9000  2019-11-12 [1] Github (r-lib/rlang@5a0b80a)
#>  rmarkdown     1.16        2019-10-01 [1] CRAN (R 3.6.1)
#>  rprojroot     1.3-2       2018-01-03 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1       2018-11-05 [1] CRAN (R 3.6.0)
#>  stringi       1.4.3       2019-03-12 [1] CRAN (R 3.6.0)
#>  stringr       1.4.0       2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.3.0       2019-11-05 [1] CRAN (R 3.6.1)
#>  usethis       1.5.1       2019-07-04 [1] CRAN (R 3.6.1)
#>  withr         2.1.2       2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.11        2019-11-12 [1] CRAN (R 3.6.1)
#>  yaml          2.2.0       2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /home/elio/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
```

skeydan commented 4 years ago

Thanks, I can reproduce! I'll keep you posted :-)

skeydan commented 4 years ago

Actually - sorry, I answered too fast :-) I get that same error even with ncdf4 NOT loaded...

eliocamp commented 4 years ago

Wait, are you getting the error when using the code of the last example even in a new session? :sob:

skeydan commented 4 years ago

Yeah, exactly. And when I leave R out of the equation completely:

Python 3.7.5 (default, Dec 15 2019, 17:54:26) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import xarray as xarr
>>> xarr.open_data("cru10min30_tmp.nc4")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'xarray' has no attribute 'open_data'
>>> xarr.open_dataset("cru10min30_tmp.nc4")
Traceback (most recent call last):
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/file_manager.py", line 198, in _acquire_with_cache_info
    file = self._cache[self._key]
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/lru_cache.py", line 53, in __getitem__
    value = self._cache[key]
KeyError: [<function _open_scipy_netcdf at 0x7f295662d440>, ('/home/key/Downloads/cru10min30_tmp.nc4',), 'r', (('mmap', None), ('version', 2))]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 83, in _open_scipy_netcdf
    return scipy.io.netcdf_file(filename, mode=mode, mmap=mmap, version=version)
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/scipy/io/netcdf.py", line 284, in __init__
    self._read()
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/scipy/io/netcdf.py", line 609, in _read
    self.filename)
TypeError: Error: /home/key/Downloads/cru10min30_tmp.nc4 is not a valid NetCDF 3 file

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/api.py", line 535, in open_dataset
    ds = maybe_decode_store(store)
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/api.py", line 450, in maybe_decode_store
    use_cftime=use_cftime,
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/conventions.py", line 570, in decode_cf
    vars, attrs = obj.load()
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/common.py", line 123, in load
    (_decode_variable_name(k), v) for k, v in self.get_variables().items()
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 157, in get_variables
    (k, self.open_store_variable(k, v)) for k, v in self.ds.variables.items()
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 146, in ds
    return self._manager.acquire()
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/file_manager.py", line 180, in acquire
    file, _ = self._acquire_with_cache_info(needs_lock)
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/file_manager.py", line 204, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "/home/key/.virtualenvs/0105/lib64/python3.7/site-packages/xarray/backends/scipy_.py", line 94, in _open_scipy_netcdf
    raise TypeError(errmsg)
TypeError: Error: /home/key/Downloads/cru10min30_tmp.nc4 is not a valid NetCDF 3 file
            If this is a NetCDF4 file, you may need to install the
            netcdf4 library, e.g.,

            $ pip install netcdf4

So I installed netcdf4, and then it worked for me from Python as well as from R - with or without

library(ncdf4)

having been executed:

> file4 <- path.expand("~/Downloads/cru10min30_tmp.nc4")
> library(reticulate)
> use_virtualenv("0105", T)
> library(ncdf4)
> xr <- import("xarray")
> dt <- xr$open_dataset(file4)
> dt
<xarray.Dataset>
Dimensions:      (lat: 360, lon: 720, nv: 2, time: 12)
Coordinates:
  * lat          (lat) float64 -89.75 -89.25 -88.75 -88.25 ... 88.75 89.25 89.75
  * lon          (lon) float64 -179.8 -179.2 -178.8 -178.2 ... 178.8 179.3 179.8
  * time         (time) datetime64[ns] 1976-01-16T12:00:00 ... 1976-12-16T12:00:00
Dimensions without coordinates: nv
Data variables:
    time_bounds  (time, nv) float32 ...
    tmp          (time, lat, lon) float32 ...
Attributes:
    data:         CRU CL 2.0 1961-1990 Monthly Averages
    title:        CRU CL 2.0 -- 10min grid sampled every 0.5 degree
    institution:  http://www.cru.uea.ac.uk/
    source:       http://www.cru.uea.ac.uk/~markn/cru05/cru05_intro.html
    references:   New et al. (2002) Climate Res 21:1-25
    history:      Wed Jan  8 22:12:00 2020: ncks -4 -O /home/key/Downloads/cr...
    Conventions:  CF-1.0
    NCO:          netCDF Operators version 4.8.1 (Homepage = http://nco.sf.ne...

Can you try again after installing

pip install netcdf4

in Python and see if it solves the problem?
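Note that pip treats the package name case-insensitively, while the module you import is spelled `netCDF4`. To check whether the import itself succeeds in a given interpreter, something like the sketch below could help. The helper name `check_netcdf4_import` is invented here, and the version attributes are read defensively with `getattr` in case a build does not expose them.

```python
def check_netcdf4_import():
    """Try to import netCDF4 and report its versions, or the error that blocked it."""
    try:
        import netCDF4
    except Exception as exc:  # ImportError is typical; a broken shared library may raise others
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return {
        "ok": True,
        "version": getattr(netCDF4, "__version__", None),
        # Versions of the C libraries the Python module was linked against.
        "netcdf_c_lib": getattr(netCDF4, "__netcdf4libversion__", None),
        "hdf5_lib": getattr(netCDF4, "__hdf5libversion__", None),
    }
```

Running this once in a fresh Python session and once through reticulate after `library(ncdf4)` would show whether loading the R package changes the outcome.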

eliocamp commented 4 years ago

Testing on my home computer, I had to install everything from scratch, and it works. Let me test it on my work computer (where it failed originally) after the weekend. Maybe it was a problem with the Python packages that is now fixed?

skeydan commented 4 years ago

Maybe! Were you able to test on the other machine already?

eliocamp commented 4 years ago

Yeah, unfortunately the original error is still present in my other machine. I did pip install netcdf4 in my terminal and also tried with py_install("netcdf4") in R (got "All requested packages already installed."). :sob:

I don't know if it helps, but this is the result of py_config()

reticulate::py_config()
#> python:         /home/elio/miniconda3/envs/r-reticulate/bin/python
#> libpython:      /home/elio/miniconda3/envs/r-reticulate/lib/libpython3.7m.so
#> pythonhome:     /home/elio/miniconda3/envs/r-reticulate:/home/elio/miniconda3/envs/r-reticulate
#> version:        3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21)  [GCC 7.3.0]
#> numpy:          /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/numpy
#> numpy_version:  1.17.1
#> 
#> python versions found: 
#>  /home/elio/miniconda3/envs/r-reticulate/bin/python
#>  /usr/bin/python
#>  /usr/bin/python3
#>  /home/elio/.virtualenvs/r-keras/bin/python
#>  /home/elio/miniconda3/envs/metR/bin/python
#>  /home/elio/miniconda3/bin/python

Created on 2020-01-14 by the reprex package (v0.3.0)

skeydan commented 4 years ago

That's all pretty strange... idk if it helps, but could you try uninstalling netcdf4

pip uninstall netcdf4

then reinstall it, and try again?

eliocamp commented 4 years ago

Still failing :'(

skeydan commented 4 years ago

Hm :-(

Can you check that all versions (the ncdf4 R package, and the xarray and netCDF4 Python packages) are the same on both machines?

Also, can you

sudo updatedb
locate libnetcdf | xargs ldd

on both machines?
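Since reticulate embeds Python inside the R process, one hypothesis (not confirmed anywhere in this thread) is that the R package and the Python module each bring their own copy of the netCDF or HDF5 shared libraries into the same process. On Linux, the libraries actually mapped into the running process can be listed from `/proc/self/maps`. The sketch below assumes that file layout, and the helper name `loaded_libraries` is made up.

```python
import os


def loaded_libraries(keywords=("netcdf", "hdf5")):
    """List shared objects mapped into this process whose file name contains a keyword (Linux only)."""
    maps_path = "/proc/self/maps"
    if not os.path.exists(maps_path):
        return []  # not Linux, or /proc is unavailable
    found = set()
    with open(maps_path) as fh:
        for line in fh:
            # Fields: address, perms, offset, dev, inode, pathname (pathname may be absent).
            parts = line.split(None, 5)
            if len(parts) < 6:
                continue
            path = parts[5].strip()
            name = os.path.basename(path).lower()
            if ".so" in name and any(k in name for k in keywords):
                found.add(path)
    return sorted(found)
```

Calling it through reticulate after both `library(ncdf4)` and `import("xarray")` would show whether two different copies of the same library are loaded.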

eliocamp commented 4 years ago

Ok, I'll send you both reports later today.

One weird thing is that pip freeze (which google tells me lists all installed modules) didn't list xarray as installed, however py_available("xarray") was TRUE. Now, after using pip install xarray I get these version numbers from pip freeze

xarray==0.11.3
netCDF4==1.5.3

However, checking from R gives me different results:

library(reticulate)
mt <- import("importlib_metadata")
mt$version("xarray")
#> [1] "0.14.1"
mt$version("netCDF4")
#> [1] "1.5.1.2"
packageVersion("ncdf4")
#> [1] '1.17'

Created on 2020-01-14 by the reprex package (v0.3.0)

Session info

``` r
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 3.6.2 (2019-12-12)
#>  os       elementary OS 5.1 Hera
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_US
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Argentina/Buenos_Aires
#>  date     2020-01-14
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)
#>  backports     1.1.5      2019-10-02 [1] CRAN (R 3.6.1)
#>  callr         3.4.0      2019-12-09 [1] CRAN (R 3.6.2)
#>  cli           2.0.1      2020-01-08 [1] CRAN (R 3.6.2)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.6.0)
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.6.0)
#>  devtools      2.2.2      2020-01-14 [1] Github (r-lib/devtools@a9bd18c)
#>  digest        0.6.23     2019-11-23 [1] CRAN (R 3.6.1)
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.2)
#>  fs            1.3.1      2019-05-06 [1] CRAN (R 3.6.1)
#>  glue          1.3.1.9000 2020-01-14 [1] Github (tidyverse/glue@b9ffe6c)
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  jsonlite      1.6        2018-12-07 [1] CRAN (R 3.6.0)
#>  knitr         1.26       2019-11-12 [1] CRAN (R 3.6.2)
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.6.0)
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.0)
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 3.6.1)
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.6.0)
#>  prettyunits   1.1.0      2020-01-09 [1] CRAN (R 3.6.2)
#>  processx      3.4.1      2019-07-18 [1] CRAN (R 3.6.1)
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.6.0)
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 3.6.1)
#>  rappdirs      0.3.1      2016-03-28 [1] CRAN (R 3.6.0)
#>  Rcpp          1.0.3      2019-11-08 [1] CRAN (R 3.6.1)
#>  remotes       2.1.0      2019-06-24 [1] CRAN (R 3.6.1)
#>  reticulate  * 1.14       2019-12-17 [1] CRAN (R 3.6.2)
#>  rlang         0.4.2      2019-11-23 [1] CRAN (R 3.6.2)
#>  rmarkdown     2.0        2019-12-12 [1] CRAN (R 3.6.2)
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.6.0)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.6.0)
#>  stringi       1.4.5      2020-01-11 [1] CRAN (R 3.6.2)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.6.0)
#>  testthat      2.3.1      2019-12-01 [1] CRAN (R 3.6.2)
#>  usethis       1.5.1      2019-07-04 [1] CRAN (R 3.6.1)
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.6.0)
#>  xfun          0.12       2020-01-13 [1] CRAN (R 3.6.2)
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.6.0)
#>
#> [1] /home/elio/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
```

So something fishy is going on with the different module installations.

PS: pip3 freeze gives yet another combination of version numbers

xarray==0.14.1
netCDF4==1.5.3

PPS: Oh, all of this is on the PC where the code fails.
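Differing lists from `pip freeze`, `pip3 freeze`, and R usually mean each command is talking to a different Python installation. A small sketch for seeing which interpreter is running and which file a module would be imported from follows. The helper name `locate_module` is hypothetical.

```python
import importlib.util
import sys


def locate_module(name):
    """Return (interpreter path, file the module would be imported from); the file is None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        spec = None
    origin = spec.origin if spec is not None else None
    return sys.executable, origin
```

Running `locate_module("xarray")` in each session (terminal Python, and via reticulate) makes it visible whether they share a `site-packages` directory. Invoking pip as `python -m pip freeze` with an explicit interpreter path ties the listing to that interpreter.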

skeydan commented 4 years ago

That looks weird! But, can we first make sure all these refer to the same environment?

For python, on the command line: conda activate r-reticulate

For reticulate, use_condaenv("r-reticulate", required = TRUE) (please don't omit that required = TRUE)

Just to be sure, better double-check with py_config() as well...

eliocamp commented 4 years ago

I get conda: command not found. I don't remember having installed conda or anaconda or whatever.

skeydan commented 4 years ago

Do you mean miniconda was installed by reticulate, not you yourself? (This is possible starting from 1.14, see https://blog.rstudio.com/2019/12/20/reticulate-1-14/)

/home/elio/miniconda3/envs/r-reticulate/bin/python

?

skeydan commented 4 years ago

Also, can you check if you have any conda related entries in your bashrc?

eliocamp commented 4 years ago

Yep, /home/elio/miniconda3/envs/r-reticulate/bin/python runs python alright, and using importlib_metadata inside that session I get the same versions as from R. There's no conda stuff in .bashrc.

skeydan commented 4 years ago

OK so summarizing, you have (at least) 3 environments with different versions of the libraries concerned...

Can we do a comparison test? In all cases, please make sure the "right" env is used by first restarting R and then, doing one of

use_condaenv("...", required = TRUE)
use_python("...", required = TRUE)
use_virtualenv("...", required = TRUE)

(Also please doublecheck using py_config().)

1) condaenv r-reticulate

which has xarray 0.14.1, netCDF4 1.5.1.2

2) whichever env that is which has

xarray 0.14.1, netCDF4 1.5.3

3) whichever env that is which has

xarray 0.11.3, netCDF4 1.5.3

For all three setups, please test whether you get "that weird thing" (it works when the R package is not loaded but errors when it is).

As a sidenote, my env - where stuff works fine - corresponds to setup (2) above.
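To line up the library versions across several interpreters without activating each environment by hand, a probe script can be run under each Python binary. This is only a sketch: `PROBE` and `probe_interpreter` are names invented here, and on Python 3.7 the version lookup relies on the `importlib_metadata` backport being installed in the target environment (otherwise versions come back as `None`).

```python
import json
import subprocess
import sys

# Small script executed by the *target* interpreter; prints one JSON object.
PROBE = r"""
import json, sys
try:
    from importlib import metadata
except ImportError:
    try:
        import importlib_metadata as metadata
    except ImportError:
        metadata = None
out = {"python": sys.executable}
for name in sys.argv[1:]:
    try:
        out[name] = metadata.version(name) if metadata else None
    except Exception:
        out[name] = None
print(json.dumps(out))
"""


def probe_interpreter(python_path, names=("xarray", "netCDF4")):
    """Run the probe under another interpreter and return its version report as a dict."""
    result = subprocess.run(
        [python_path, "-c", PROBE, *names],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


# Example, using interpreter paths that py_config() listed earlier in this thread:
# for path in ("/home/elio/miniconda3/envs/r-reticulate/bin/python", "/usr/bin/python3"):
#     print(probe_interpreter(path))
```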

skeydan commented 4 years ago

Hi, did you have a chance to test this? Thanks.

eliocamp commented 4 years ago

Sorry, I was busy and out of the office. I'll get to it asap :)

eliocamp commented 4 years ago

I have these 3 environments (don't ask me why!)

reticulate::conda_list()
#>             name                                               python
#> 1     miniconda3                     /home/elio/miniconda3/bin/python
#> 2 netCDF4-python /home/elio/miniconda3/envs/netCDF4-python/bin/python
#> 3   r-reticulate   /home/elio/miniconda3/envs/r-reticulate/bin/python

Testing each (always restarting R between tests):

test_env <- function(env) {
  library(reticulate)
  use_condaenv(env, required = TRUE)
  print(py_config())

  file <- "~/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4"
  library(ncdf4)

  xr <- try(import("xarray"), silent = TRUE)
  if (inherits(xr, "try-error")) {
    return(list(env = env, 
                xarray_version =  NA,
                netcdf4_version = NA,
                success = NA))
  }

  dt <- try(xr$open_dataset(file), silent = TRUE)

  mt <- import("importlib_metadata")

  return(list(env = env, 
              xarray_version =  mt$version("xarray"),
              netcdf4_version = mt$version("netCDF4"),
              success = !inherits(dt, "try-error")))
}

test_env("miniconda3")
#> python:         /home/elio/miniconda3/bin/python
#> libpython:      /home/elio/miniconda3/lib/libpython3.7m.so
#> pythonhome:     /home/elio/miniconda3:/home/elio/miniconda3
#> version:        3.7.3 (default, Mar 27 2019, 22:11:17)  [GCC 7.3.0]
#> numpy:           [NOT FOUND]
#> 
#> NOTE: Python version was forced by use_python function
#> $env
#> [1] "miniconda3"
#> 
#> $xarray_version
#> [1] NA
#> 
#> $netcdf4_version
#> [1] NA
#> 
#> $success
#> [1] NA

This environment doesn't even have xarray installed.

test_env("netCDF4-python")
#> Error in use_python(conda_env_python[[1]], required = required): Specified version of python '/home/elio/miniconda3/envs/netCDF4-python/bin/python' does not exist.

This one doesn't exist?

test_env("r-reticulate")
#> python:         /home/elio/miniconda3/envs/r-reticulate/bin/python
#> libpython:      /home/elio/miniconda3/envs/r-reticulate/lib/libpython3.7m.so
#> pythonhome:     /home/elio/miniconda3/envs/r-reticulate:/home/elio/miniconda3/envs/r-reticulate
#> version:        3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21)  [GCC 7.3.0]
#> numpy:          /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/numpy
#> numpy_version:  1.17.1
#> 
#> NOTE: Python version was forced by use_python function
#> $env
#> [1] "r-reticulate"
#> 
#> $xarray_version
#> [1] "0.14.1"
#> 
#> $netcdf4_version
#> [1] "1.5.1.2"
#> 
#> $success
#> [1] FALSE

And this one has the usual error.

I don't know how to get to the other environments.

eliocamp commented 4 years ago

Ok. I ran this

library(reticulate)
use_condaenv("r-reticulate", required = TRUE)
py_install("netcdf4", pip = TRUE)

And now the netcdf4 version in r-reticulate is indeed 1.5.3 AND the bug seems to be gone!

file <- "~/Documents/CONICET/onda3/DATA/zg_Amon_CNRM-CM6-1_historical_i1p1f2_gr_185001-201412.nc4"

library(ncdf4)

library(reticulate)
xr <- import("xarray")
mt <- import("importlib_metadata")
mt$version("xarray")
#> [1] "0.14.1"
mt$version("netcdf4")
#> [1] "1.5.3"
py_config()
#> python:         /home/elio/miniconda3/envs/r-reticulate/bin/python
#> libpython:      /home/elio/miniconda3/envs/r-reticulate/lib/libpython3.7m.so
#> pythonhome:     /home/elio/miniconda3/envs/r-reticulate:/home/elio/miniconda3/envs/r-reticulate
#> version:        3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21)  [GCC 7.3.0]
#> numpy:          /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/numpy
#> numpy_version:  1.17.1
#> xarray:         /home/elio/miniconda3/envs/r-reticulate/lib/python3.7/site-packages/xarray
#> 
#> python versions found: 
#>  /home/elio/miniconda3/envs/r-reticulate/bin/python
#>  /home/elio/miniconda3/envs/r-reticulate/bin/python3
#>  /usr/bin/python3
#>  /usr/bin/python
#>  /home/elio/.virtualenvs/r-keras/bin/python
#>  /home/elio/miniconda3/bin/python

dt <- xr$
  open_dataset(file)
dt
#> <xarray.Dataset>
#> Dimensions:  (lat: 128, lon: 256, plev: 2, time: 660)
#> Coordinates:
#>   * time     (time) datetime64[ns] 1850-01-01 1850-04-01 ... 2014-10-01
#>   * lat      (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
#>   * lon      (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
#>   * plev     (plev) float32 500.0 2000.0
#> Data variables:
#>     zg       (time, plev, lat, lon) float32 ...

Created on 2020-01-23 by the reprex package (v0.3.0)

skeydan commented 4 years ago

Thanks for testing! So perhaps it really was an issue of library versions (for whatever reason)?

As you have a working configuration now, should we regard this as solved?

eliocamp commented 4 years ago

I guess we should! Thanks for the troubleshooting and sorry for all the trouble.

skeydan commented 4 years ago

No problem :-)