geocompx / docker

Dockerfiles for Geocomputation
https://github.com/geocompx/docker/pkgs/container/docker
GNU General Public License v3.0

Create r-py and py-r images (R-based images with Python and Julia, and Python-based images with R) #53

Open Robinlovelace opened 2 days ago

Robinlovelace commented 2 days ago

Planning to use the install_python.sh and install_julia.sh scripts from rocker: https://github.com/rocker-org/rocker-versioned2/tree/master/scripts

martinfleis commented 2 days ago

Also see this https://github.com/darribas/gds_env which has both Python and R spatial ecosystems.

Robinlovelace commented 2 days ago

Looks good, but the thinking here is to have an R-based image with Py (and Julia) and a Py-based image with R, building on the well-maintained rocker project. On the topic of maintenance, I just opened an issue: I was trying to run the example but couldn't find the flavour descriptions: https://github.com/darribas/gds_env/issues/90

Robinlovelace commented 2 days ago

Heads-up @evetion https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/install_julia.sh @rafaqz

martinfleis commented 1 day ago

The gds_env stacks are described at https://darribas.org/gds_env/stacks/. The great thing about that container is that it first pulls Python stack from conda-forge and then builds R stack from source against the same versions of GEOS, GDAL and PROJ. If you pull R stuff from CRAN and Python stuff from PyPI or elsewhere, you will likely end up with different versions of these dependencies.
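
To see which versions the Python side actually links against, a minimal sketch along these lines may help. It assumes shapely 2.x (`shapely.geos_version`), the GDAL Python bindings (`osgeo.gdal`) and pyproj (`pyproj.proj_version_str`). Any library that is missing or exposes a different attribute is reported as `None` rather than raising. The helper name `python_geo_versions` is made up for illustration.

```python
def python_geo_versions():
    """Return the GEOS, GDAL and PROJ versions seen by the Python bindings.

    Each value is a version string, or None if the binding is not
    installed (or does not expose the attribute assumed here).
    """
    versions = {"GEOS": None, "GDAL": None, "PROJ": None}

    try:
        import shapely  # assumes shapely >= 2.0
        versions["GEOS"] = ".".join(str(part) for part in shapely.geos_version)
    except Exception:
        pass

    try:
        from osgeo import gdal
        versions["GDAL"] = gdal.VersionInfo("RELEASE_NAME")
    except Exception:
        pass

    try:
        import pyproj
        versions["PROJ"] = pyproj.proj_version_str
    except Exception:
        pass

    return versions


if __name__ == "__main__":
    for lib, version in python_geo_versions().items():
        print(f"{lib}: {version or 'not available'}")
```

Running this inside an image and comparing the result with what `library(sf)` prints in R shows whether the two stacks agree.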

Robinlovelace commented 1 day ago

:+1: to sharing deps where possible. Do you have info on gds image sizes?

martinfleis commented 1 day ago

gds_py is 1.14 GB when compressed, gds that also includes R is 3 GB.

Robinlovelace commented 1 day ago

> gds_py is 1.14 GB when compressed, gds that also includes R is 3 GB.

The micromamba image here is 300 MB. I'm currently exploring pixi to install all deps efficiently, see #54.

mdsumner commented 15 hours ago

> The gds_env stacks are described at https://darribas.org/gds_env/stacks/. The great thing about that container is that it first pulls Python stack from conda-forge and then builds R stack from source against the same versions of GEOS, GDAL and PROJ. If you pull R stuff from CRAN and Python stuff from PyPI or elsewhere, you will likely end up with different versions of these dependencies.

oh thank goodness, this is the first time I've heard this desired by anyone else. I have a docker image for R and Python aligned to the daily build of GDAL, but my fu is not excellent and I'm running into problems a few months later.

Robinlovelace commented 10 hours ago

Not quite as done as I would have liked.

Daily GDAL is next level. @mdsumner, could you share examples? See #40 for lots of other examples.

Robinlovelace commented 10 hours ago

Found them: https://github.com/mdsumner/gdal-builds

mdsumner commented 9 hours ago

my crufty dockerfiles basically take from rocker and from the CI builds done by GDAL itself. I wanted "layering", but the fact is I use my monolithic R and Python image every day now (there's an issue with numpy and sometimes pyproj, but nothing stopping me from working). Also, my Python should use environments, but it's just another thing to learn, as ever.

note that @cboettig pursued getting the GDAL builder images published here, but that is not considered desirable. Of course anyone can just go and do that, but while I've played around with multi-stage builds I'm certainly not very adept yet.

https://github.com/OSGeo/gdal/issues/9824

The things I wanted that aren't provided by the otherwise excellent rocker are:

With all the Python packages I just had to pick through the right order to avoid anything bringing its own PROJ, GEOS or GDAL, and I drew the line at compiling NetCDF/HDF5/HDF4, but consistency there isn't so important to me anyway.
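
One way to spot Python packages that brought their own copies is to scan the environment for vendored shared libraries. The sketch below assumes the Linux wheel convention where bundled libraries are placed in a `<package>.libs` directory next to the package (macOS wheels typically use a different layout, which is not covered here). The function name `find_bundled_geo_libs` and the prefix list are illustrative choices, not part of any tool.

```python
import sysconfig
from pathlib import Path

# Library name prefixes to look for (illustrative, extend as needed).
NATIVE_PREFIXES = ("libgdal", "libgeos", "libproj")


def find_bundled_geo_libs(site_packages):
    """Map package name -> sorted file names of bundled GDAL/GEOS/PROJ libs.

    Only looks at '<package>.libs' directories directly under site_packages.
    """
    found = {}
    for libs_dir in sorted(Path(site_packages).glob("*.libs")):
        if not libs_dir.is_dir():
            continue
        hits = sorted(
            p.name for p in libs_dir.iterdir() if p.name.startswith(NATIVE_PREFIXES)
        )
        if hits:
            # Strip the trailing '.libs' to recover the package name.
            found[libs_dir.name[: -len(".libs")]] = hits
    return found


if __name__ == "__main__":
    # 'platlib' is where packages with compiled extensions are installed.
    site_dir = sysconfig.get_paths()["platlib"]
    for package, libs in find_bundled_geo_libs(site_dir).items():
        print(f"{package} bundles: {', '.join(libs)}")
```

An empty result suggests that the installed packages link against the system (or conda) libraries rather than private copies, although it is not proof: a package could vendor libraries in another location.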

But I can't see how to do all that and also keep rocker cleanly layered as it is now. I was advised during Posit conf: "why not take the higher-level rocker, and then just clobber the libs and installs you want on that". I haven't tried that yet; I thought it would be "bad practice", but apparently not.

Robinlovelace commented 8 hours ago

Not to mention the GeoJulia stack, although that's different because, according to @evetion, Julia packages will never link to system versions: each package downloads its own binaries, apparently, which is good for reproducibility but perhaps not so good for image sizes and the ability to quickly test different versions of GDAL etc. Your list looks good. I agree Rocker is rock solid, so will look to continue to build on that (continuing experiments with pixi and more), and keep an eye on GDAL-based builds.

mdsumner commented 8 hours ago

fwiw, I probably don't need bleeding-edge GDAL every day, and I see Pangeo docker is now at GDAL 3.9.0, which is excellent (and a bit surprising). I should probably tone it down and align to the latest release. If I really need the latest GDAL it's not hard to build, and I do that anyway because of pending PRs (it's a ten-minute turnaround at most to check out one of Even Rouault's PRs and test a fix, as I found yesterday).

Just a long-winded way of saying it's time I reviewed how I tap in here.

Robinlovelace commented 8 hours ago

Conda seems to have GDAL 3.9.2, as per the reprex below:

```
docker run -it ghcr.io/geocompx/docker:pixi-r bash
root@5cac6168a483:/# R

R version 4.4.1 (2024-06-14) -- "Race for Your Life"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-conda-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(sf)
Linking to GEOS 3.12.2, GDAL 3.9.2, PROJ 9.5.0; sf_use_s2() is TRUE
```
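
The sf startup line can also be parsed to compare versions across stacks automatically. The following is a minimal sketch: the function names `parse_sf_linking` and `mismatches` are made up for illustration, and the `python_versions` values are hypothetical placeholders, not measurements from any image.

```python
import re


def parse_sf_linking(message):
    """Extract {'GEOS': ..., 'GDAL': ..., 'PROJ': ...} from sf's startup message."""
    return dict(re.findall(r"\b(GEOS|GDAL|PROJ) (\d+(?:\.\d+)*)", message))


def mismatches(r_versions, other_versions):
    """Return {lib: (r_version, other_version)} for libraries whose versions differ."""
    return {
        lib: (version, other_versions.get(lib))
        for lib, version in r_versions.items()
        if version != other_versions.get(lib)
    }


# Message copied from the R session above.
sf_message = "Linking to GEOS 3.12.2, GDAL 3.9.2, PROJ 9.5.0; sf_use_s2() is TRUE"
r_versions = parse_sf_linking(sf_message)

# Hypothetical versions reported by a Python stack (placeholder values).
python_versions = {"GEOS": "3.12.2", "GDAL": "3.9.0", "PROJ": "9.5.0"}

print(r_versions)
print(mismatches(r_versions, python_versions))
# → {'GDAL': ('3.9.2', '3.9.0')}
```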

Robinlovelace commented 5 hours ago

See discussion here: https://github.com/prefix-dev/pixi/discussions/2088#discussioncomment-10702317