CoffeaTeam / docker-coffea-base

Base Docker image for Coffea
BSD 3-Clause "New" or "Revised" License

Create environment from `environment.yml` #73

Open matthewfeickert opened 2 years ago

matthewfeickert commented 2 years ago

To make it easier to build on top of these images, it would be quite useful if they were constructed from an `environment.yml` file that is copied into the image and then used to build the environment. That would mean replacing

https://github.com/CoffeaTeam/docker-coffea-base/blob/a879ea1cf4c00b561b1969c489f0809552ae479e/base/Dockerfile#L11

with

RUN mamba install --yes --file /environment.yml

This would also make it easier to create a lock file for the environment with conda-lock, and then to build on top of these images by extending the environment.yml and conda-lock files as needed.
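A minimal sketch of what the relevant part of the Dockerfile could look like under this proposal. The base image name here is a placeholder assumption, not necessarily what this repository uses. The sketch also swaps in `mamba env update`, a subcommand that accepts a YAML environment file; whether `mamba install --file` accepts YAML may depend on the mamba version in the image.

```dockerfile
# Assumption: placeholder base image that ships mamba; the real base image may differ.
FROM condaforge/mambaforge:latest

# Copy the environment definition into the image so that downstream
# images can locate it and extend it.
COPY environment.yml /environment.yml

# Update the base environment from the copied file, then clear the
# package caches to limit the size of the resulting layer.
RUN mamba env update --name base --file /environment.yml && \
    mamba clean --all --yes
```

A downstream image could then `COPY` its own extended environment.yml and repeat the same `RUN` step on top of this one.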

Is this in scope?

oshadura commented 2 years ago

@lgray do you have any preferences?

(For reference, Matthew is working on the generic IRIS-HEP Analysis System Docker image, and we had the idea of using docker-coffea-base as the base image for it.)

nsmith- commented 1 year ago

I am trying this now as part of making a coffea2023 image, and of course there is an issue: any conda-installable pytorch conflicts with nomkl, and if we pip-install it instead, we cannot pip-install torch-scatter, etc. in the same step, since their setup seems to require the ability to import torch. Perhaps we just take the hit and start using Intel MKL?

matthewfeickert commented 1 year ago

@nsmith- I don't know the specifics of why you do or don't want PyTorch with nomkl, but if you haven't already, I would open an issue with torch-scatter: requiring torch to be available at install time, rather than using a PEP 518 build-system table to provide it at build time, should be considered a serious bug these days. (edit: cf. https://github.com/rusty1s/pytorch_scatter/issues/265)

nsmith- commented 1 year ago

The original impetus for nomkl was to reduce the image size, but now that our image has ballooned to 1.4 GB, saving the 200 MB for MKL seems less important. With https://github.com/CoffeaTeam/docker-coffea-base/blob/coffea2023/base/environment.yml I managed to build a first image, which I manually uploaded to Docker Hub.

nsmith- commented 9 months ago

FYI we're consolidating the images/environments in https://github.com/CoffeaTeam/af-images/pull/1