subugoe / metar

Documentation and suggested best practices for data analysis at WAG
https://subugoe.github.io/metaR
MIT License

abstract away and document parallel processing on gwdg #54

Open maxheld83 opened 4 years ago

maxheld83 commented 4 years ago

I almost forgot to log this here. It came up as a result of the Crossref dump presentation by @njahn82.

(This repo metaR is supposed to document best practices and provide wrapper/helper functions to set this up, so I think the result of this work should go in here).

I haven't had a chance to look into this yet, @Ahobert, and it might be a couple of weeks before I get around to it.

I thought about leveraging Singularity, which the GWDG HPC supports. This would allow us to reuse existing Docker images, which I'm already trying to get us to standardise on for CI and easy development.

Also see https://github.com/subugoe/hoad/issues/101, https://github.com/subugoe/hoad/issues/29.

I would caution against using rstudio.gwdg.de, or the instances behind it, for batch jobs. According to the GWDG, this is not a reproducible environment. It will sometimes work and sometimes not, in an unpredictable manner.

We can sidestep this issue by just shipping our batch jobs as Docker containers, which is one of the use cases Docker was made for.
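
To make the idea concrete, here is a minimal sketch of what a helper for this could look like. It only builds the text of a batch script that runs an R script inside a Docker image via `singularity exec docker://...`. Several things here are assumptions and not something settled in this issue: that jobs are submitted through Slurm (`#SBATCH` directives), the function names `singularity_command` and `slurm_script`, the `rocker/r-ver:4.0.2` image, and the `analysis.R` script name are all placeholders for illustration.

```python
import shlex


def singularity_command(image, script, args=()):
    """Build the argv for running an R script inside a Docker image.

    The `docker://` prefix asks Singularity to fetch the image from a
    Docker registry and run it as a Singularity container.
    """
    return ["singularity", "exec", f"docker://{image}", "Rscript", script, *args]


def slurm_script(image, script, args=(), cpus=4, time_limit="01:00:00",
                 job_name="metar-batch"):
    """Render a batch script as a string (assumes a Slurm scheduler)."""
    # Quote each part so paths or arguments with spaces survive the shell.
    command = " ".join(
        shlex.quote(part) for part in singularity_command(image, script, args)
    )
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={time_limit}",
        "",
        command,
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical image and script names, used only as placeholders.
    print(slurm_script("rocker/r-ver:4.0.2", "analysis.R"))
```

The generated text could be written to a file and submitted with `sbatch`. Keeping the command construction separate from the scheduler directives means the same image and script could also be run locally with Docker during development.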