eeholmes / earthdata-cloud-cookbook

A tutorial book of workflows for research using NASA EarthData in the Cloud created by the NASA-Openscapes team
https://nasa-openscapes.github.io/earthdata-cloud-cookbook

problem with raster, terra etc when using the devcontainer base image #4

Open eeholmes opened 5 months ago

eeholmes commented 5 months ago

@cboettig

In an image with

FROM ghcr.io/rocker-org/devcontainer/tidyverse:4.3

as the base image, we were unable to load spatial packages (installed via install.R). Dockerfile: https://github.com/nmfs-opensci/container-images/blob/main/images/coastwatch/Dockerfile install.R: https://github.com/nmfs-opensci/container-images/blob/main/images/coastwatch/install.R

error

> library(raster)
Error: package or namespace load failed for ‘raster’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/terra/libs/terra.so':
  libproj.so.22: cannot open shared object file: No such file or directory

The shared object that fails to load varies from attempt to attempt; terra.so and ncdf4.so were the most common.

fix

Switched to

FROM rocker/geospatial:4.2

to get it to work. Note that install.R includes rgdal, which has been retired and is no longer on CRAN. Maybe that is the issue? Need to test without it.

to test

Use ghcr.io/rocker-org/devcontainer/tidyverse:4.3 in 'bring your image', then install some spatial packages and see if the problem persists. Or add packages to install.R a few at a time and see when the problem appears.
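
The incremental test described above could be scripted by trying to load each package in turn and recording the ones that fail. This is only a minimal sketch: `load_failures` is a hypothetical helper name (not part of any package), and the package list in the example call is taken from this thread.

```r
# Hypothetical helper: try to load each package's namespace and collect
# the error message for every package that fails to load.
load_failures <- function(pkgs) {
  msgs <- vapply(pkgs, function(p) {
    tryCatch(
      {
        loadNamespace(p)
        NA_character_            # loaded fine, nothing to report
      },
      error = function(e) conditionMessage(e)
    )
  }, character(1))
  msgs[!is.na(msgs)]             # named vector: package name -> error message
}

# Example call (the result depends on what is installed in the image):
# load_failures(c("ncdf4", "sf", "terra", "raster"))
```

A missing system library such as libproj should show up in the returned message for the affected package, which makes it easier to see whether the same library is behind every failure.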

eeholmes commented 5 months ago

Ok, just using FROM rocker/geospatial:4.2 didn't work. Probably ghcr.io/rocker-org/devcontainer/tidyverse:4.3 has some set-up we need to run the RStudio server. Got this error when I clicked the RStudio button: [image]

cboettig commented 5 months ago

@eeholmes yup all correct.

RUN /rocker_scripts/install_geospatial.sh

Or optionally install bleeding edge GDAL libraries using

RUN /rocker_scripts/experimental/install_dev_osgeo.sh

as in https://github.com/eeholmes/earthdata-cloud-cookbook/blob/main/.devcontainer/venv.Dockerfile#L8 . The former option is much "safer" -- the latter option means R packages that bind GDAL (terra, sf, gdalcubes) must be built from source, and it's easy to accidentally install/upgrade one of those as binaries on the dev_osgeo flavor, so probably better to stick with install_geospatial.sh whenever possible.

Re Codespaces -- yup, I think that's right. We can probably figure out a similar 'opt-in' script to make it easier to add the stuff that is in the devcontainer flavor on an arbitrary image. Just need to take a closer look at what's needed there.

eeholmes commented 5 months ago

We installed raster, terra, etc in install.R but we still got the error. Was that due to the GDAL binding issue?

eeholmes commented 5 months ago

list.of.packages <- c("ncdf4", "httr","plyr","lubridate", "parsedate", "rerddap","plotdap",
                      "rerddapXtracto", "maps", "mapdata","RColorBrewer",
                      "ggplot2","scales","dplyr","Rcurl","raster","colorRamps",
                      "parsedate","sp","sf","reshape2","jsonlite",
                      "gridGraphics","PBSmapping","date","viridis",
                      "openair","cmocean", "terra")
install.packages(list.of.packages)

eeholmes commented 5 months ago

Also, argh, install.packages used the latest versions in Posit Package Manager rather than 'seeing' that we had an older version of R when we used rocker/geospatial:4.2.2

cboettig commented 5 months ago

Right, none of the R packages (sf, terra, gdalcubes) that bind the system OSGeo C++ libraries (the GDAL, PROJ, and GEOS libraries) can be successfully installed without those system libraries being installed either with apt-get (e.g. using /rocker_scripts/install_geospatial.sh) or from source (with the dev scripts).

Also, install2.r cannot install packages that are no longer on CRAN, like rgdal. However, this won't throw a hard error that breaks the build unless you tell it to do so (using the --error flag).
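
If the installs are done from an R script rather than through install2.r, a similar hard failure can be approximated by checking afterwards that every requested package is actually present. A minimal sketch, assuming the check is placed at the end of install.R; `assert_installed` is a hypothetical helper name, not a function from any package:

```r
# Hypothetical helper: stop with an error if any requested package is
# not installed, so that the image build fails instead of silently continuing.
assert_installed <- function(pkgs, installed = rownames(installed.packages())) {
  missing <- setdiff(unique(pkgs), installed)
  if (length(missing) > 0) {
    stop("Not installed: ", paste(missing, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# At the end of install.R (not run here):
# install.packages(list.of.packages)
# assert_installed(list.of.packages)
```

Note that this only confirms the package is installed. A package can be installed and still fail to load when a system library is missing, as with the libproj error at the top of this thread.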

cboettig commented 5 months ago

Also, argh, install.packages used the latest versions in Posit Package Manager rather than 'seeing' that we had an older version of R when we used rocker/geospatial:4.2.2

Whenever you use the latest tag of a rocker image, installs default to pulling the latest versions of packages. If you use a previous release (e.g. rocker/geospatial:4.3.2), versions will be locked to those that were on CRAN on the last day that release was still current (i.e. 2024-02-29). Of course it's possible to request specific versions independently of this convention, e.g. with renv or install_version, but date-pinning is probably the simplest and also closest to the CRAN view of a self-consistent environment.

eeholmes commented 5 months ago

We are not using the latest tag. We are using rocker/geospatial:4.2 (because we are using the Openscapes image at the moment)

That uses https://p3m.dev/cran/__linux__/jammy/2023-04-20

But I can see from the GitHub Actions log that when it gets to install.R, it uses /jammy/2024-03-21, i.e. the "current" date. So for the time being I specify the repo in install.R. Yes, I know a bunch of these are already in rocker/geospatial. I just didn't want to go digging to figure out which packages are in rocker/geospatial.

repo <- "https://p3m.dev/cran/__linux__/jammy/2023-04-20"
list.of.packages <- c("ncdf4", "httr", "plyr", "lubridate", "parsedate", "rerddap",
                      "maps", "mapdata", "RColorBrewer",
                      "ggplot2","scales", "dplyr", "RCurl", "raster", "colorRamps",
                      "parsedate", "sp", "sf", "reshape2", "jsonlite",
                      "gridGraphics", "PBSmapping", "date", "viridis",
                      "openair","cmocean", "terra",
                      "plotdap", "rerddapXtracto", "rgdal")
install.packages(list.of.packages, repos=repo)

eeholmes commented 5 months ago

Also, thanks for the info re how to properly add geospatial packages! I am learning as I go with the rocker images. I will fix the image to work with FROM ghcr.io/rocker-org/devcontainer/tidyverse:4.3 later, but right now we have a workshop in 4 hours so I need to get something stable working. The idea with using devcontainer here is that I want to add a Codespaces (and Binder) button.

cboettig commented 5 months ago

Re the version issue -- ah, weird. Yup, setting the repo to your preferred snapshot date is great. Still weird that it got latest images without that; can you link me to the logs? I don't see any R packages being pulled in https://github.com/nmfs-opensci/container-images/actions/runs/8382560026/job/22956537780

eeholmes commented 5 months ago

It is here: https://github.com/nmfs-opensci/container-images/actions/runs/8381708205/job/22953871681

If I had used rocker/geospatial:4.2 as the base, it would grab the right repo (the same as that in geospatial:4.2). Instead I am using openscapes/rocker, which is based on rocker/geospatial:4.2, and in this case it is not recognizing the repo used in the base image.

Obviously this is user error, but I am not sure what I need to do to make install.R use the right repo besides hard coding in the repo. What I want is for it to use the env variable CRAN...
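
One possible way to avoid hard coding the repo is to read the CRAN environment variable inside install.R and fall back to an explicit snapshot only when it is unset. This is a minimal sketch, not a confirmed fix for the behaviour above: it assumes the CRAN variable set in the base image is still visible when install.R runs, and `resolve_repo` is a hypothetical helper name.

```r
# Hypothetical helper: use the CRAN environment variable if it is set,
# otherwise fall back to an explicit snapshot URL.
resolve_repo <- function(fallback = "https://p3m.dev/cran/__linux__/jammy/2023-04-20") {
  cran <- Sys.getenv("CRAN", unset = "")
  if (nzchar(cran)) cran else fallback
}

repo <- resolve_repo()
message("Installing from: ", repo)

# In install.R the resolved repo would then be passed on (not run here):
# install.packages(list.of.packages, repos = repo)
```

Printing the resolved repo in the build log also makes it easy to confirm from the GitHub Actions output which snapshot date was actually used.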