USGS-R / river-dl

Deep learning model for predicting environmental variables on river systems
Creative Commons Zero v1.0 Universal

Update container docs and/or container #209

Open SimonTopp opened 1 year ago

SimonTopp commented 1 year ago

@jds485 pointed out to me today that the container docs/container for this repository are out of date. We set up the container a while back and no one really switched to using it from their conda environments. I think that's partially because folks had different environment setups depending on whether they wanted TF or PyTorch, and also because the Snakemake workflow for training models in parallel on TG is set up around conda environments rather than Singularity. Best practice would probably be to have an up-to-date container (or maybe even two, one for TF and one for PyTorch?), but if we're going to update/maintain them just to let them fall by the wayside like the first one did, I'm not sure it's worth the effort. Thoughts on this? @jesse-ross I'm sure you have a hot take 😉!

jesse-ross commented 1 year ago

If conda environments are working for folks, then that is probably OK for working purposes. However, conda doesn't address system dependencies. I could envision different versions of something like BLAS potentially causing issues, or different versions of the NVIDIA toolkits?
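A minimal sketch of how that kind of system-level drift might be surfaced between two machines, using only the Python standard library. The helper names (`run_quiet`, `system_report`, `diff_reports`) are hypothetical and not part of river-dl; the sketch assumes `nvidia-smi` and `nvcc` are on `PATH` where a GPU stack is installed, and records `None` otherwise.

```python
import platform
import shutil
import subprocess


def run_quiet(cmd):
    """Return the stripped stdout of cmd, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        out = subprocess.run(
            cmd, capture_output=True, text=True, timeout=10, check=True
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return out.stdout.strip()


def system_report():
    """Collect a few system-level facts that a conda environment does not pin."""
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        "machine": platform.machine(),
        # Driver version as reported by the NVIDIA driver utility, if present.
        "nvidia_driver": run_quiet(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"]
        ),
        # CUDA compiler banner, if the toolkit is installed.
        "nvcc": run_quiet(["nvcc", "--version"]),
    }


def diff_reports(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    for key, value in system_report().items():
        print(f"{key}: {value}")
```

Running this on two machines and passing the two dictionaries to `diff_reports` would list only the entries that differ. It does not inspect BLAS directly; if NumPy is installed, `numpy.show_config()` prints the BLAS/LAPACK build it was linked against.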

I don't really know enough about the current Snakemake process to say much more than that. It seems like it would likely be possible to have a dirt-simple container which just had system-level libraries installed, and no Python packages, and then let folks do whatever they want in conda on top of that. That might be a way to boost reproducibility without locking people into specific Python packages/versions in the way that the current Dockerfile does.

I'd be happy to talk about this more if you want to show me the current workflow - I'd guess it might be easiest to do that on a call.