np-core / nanopath

Python package and command line interface - entry point for the repository :snake:

Container management #2

Open esteinig opened 4 years ago

esteinig commented 4 years ago

@dn-ra - do you know some good practices for container management in Nextflow? At the moment I am using Singularity containers with all bits and pieces installed, but e.g. when you deploy to Google Cloud, is it beneficial to manage each rule's dependencies as a single small container that can be pulled when the rule is deployed?

dn-ra commented 4 years ago

Nope, that’s beyond my understanding at the moment. My next step with learning all this was to build some Nextflow pipes with cloud deployment in mind. I’ll keep an eye out.

Just to clarify, you’re wondering whether it’s better to put each dependency in its own container that can be executed as needed?

Thanks,

Dan

On 11 Feb 2020, 5:20 PM +1100, Eike Steinig wrote:

> @dn-ra - do you know some good practices for container management in Nextflow? At the moment I am using Singularity containers with all bits and pieces installed, but e.g. when you deploy to Google Cloud, is it beneficial to manage each rule's dependencies as a single small container that can be pulled when the rule is deployed?

esteinig commented 4 years ago

Yes, so my (very limited) understanding is that when you deploy in the cloud you need to pull containers on the go. Is that about right?

esteinig commented 4 years ago

I'm not quite sure how Nextflow deploys to Google Cloud - if each rule is potentially executed in a new instance, then small containers that only manage dependencies of each rule might be essential.

Thanks for keeping an eye out, that will be good to know :)

esteinig commented 4 years ago

I should say the container location for the current (very simple) Nextflow pipelines is configured as e.g. $baseDir/np-sepsis.nf, so when you work on the pipelines you need to build the Singularity image file in the pipeline directory and run with -profile local
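As a rough sketch of that workflow, the Python helper below checks that a Singularity image sits next to the pipeline script and then assembles the `nextflow run ... -profile local` command. The function name and the example file names are hypothetical; only the `-profile local` flag comes from the comment above.

```python
from pathlib import Path


def build_local_run_command(pipeline_script, image_file):
    """Return the argv list for a local Nextflow run.

    Assumption: the Singularity image is expected in the same directory
    as the pipeline script. Both file names are supplied by the caller
    and are hypothetical here.
    """
    script = Path(pipeline_script)
    image = script.parent / image_file
    if not image.is_file():
        # Fail early rather than letting the run die on a missing image.
        raise FileNotFoundError(f"Build the Singularity image first: {image}")
    return ["nextflow", "run", str(script), "-profile", "local"]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the image has been built.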

@dn-ra

dn-ra commented 4 years ago

So, having done some reading on this, it seems best practice is to make small containers for each segment of the pipe that can be joined together when required.
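To make the "one small container per segment" idea concrete, here is a minimal sketch that renders a Nextflow config fragment with one `withName` selector per process. The process names and image names are hypothetical placeholders, not the actual np-core processes or images.

```python
def render_process_containers(containers):
    """Render a Nextflow config block assigning one container per process.

    `containers` maps process name -> image name. All names used with
    this helper here are hypothetical examples.
    """
    lines = ["process {"]
    for name, image in sorted(containers.items()):
        lines.append(f"    withName: '{name}' {{")
        lines.append(f"        container = '{image}'")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical processes, each with its own small image.
    example = {
        "Minimap2": "example/minimap2:latest",
        "Kraken2": "example/kraken2:latest",
    }
    print(render_process_containers(example))
```

With a config like this, each process only pulls the image it needs, which is the property that matters if every process may land on a fresh cloud instance.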

Have you ever used/read about Kubernetes? It has Nextflow integration and could be the framework we need for container management.

esteinig commented 4 years ago

Possibly, but it might be a bit overkill for what we need right now: some Docker containers to deploy the app and database, linked into a network, with the server pipelines monitoring a directory. Are you gonna have a look at Kubernetes re your cloud efforts? Keen to hear if you do.
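For the "pipelines monitoring a directory" part, a minimal polling sketch in Python might look like the following. The function names, the `.fastq` suffix, and the polling interval are assumptions for illustration; a real setup might prefer a file-system event library instead of polling.

```python
import time
from pathlib import Path


def find_new_files(directory, seen, suffix=".fastq"):
    """Return files with `suffix` that are not yet in `seen`, and record them."""
    current = {
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == suffix
    }
    new = sorted(current - seen)
    seen.update(new)
    return new


def watch(directory, handle, interval=5.0, max_polls=None):
    """Poll `directory` and call `handle(path)` once per newly seen file.

    `max_polls=None` polls forever; a number stops after that many polls.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_files(directory, seen):
            handle(path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

Here `handle` would be whatever launches the pipeline for a new file, for example a function that starts a Nextflow run.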

dn-ra commented 4 years ago

Yeah, having read a bit more, it's definitely overkill. Gonna be a while before we end up needing it if we only have a couple of pipelines to start off with.

esteinig commented 4 years ago

Hey, so I was thinking about trying to set up the entire app - pipeline - database - report cycle in some Docker containers, like a little swarm that can be deployed on the server and the local sequencing computer.

esteinig commented 4 years ago

Sorry for the long wait on this. I have an idea for how to handle the live runs, will update soon.