dmlond / arangs2015

Repository for Automated and Reproducible Analysis of NextGen Sequence 2015
MIT License

What needs to go here? #1

Open rvosa opened 9 years ago

rvosa commented 9 years ago
dmlond commented 9 years ago

Right now I have just created the Readme.md and doc/ARANGS15/ (but I have not pushed them). I was thinking we could do intro/, basic_workflow/, virtualization/ and containerization/, and put any resources we need (e.g. Vagrantfile, Puppet configuration, Dockerfile, build scripts) in these subdirectories. We could also do our assignments as subdirectories under these, each with its own README.md.

rvosa commented 9 years ago

Doesn't it make more sense to do the directory structure as we recommend (doc, bin, src, data, conf) and have the course materials under doc?

dmlond commented 9 years ago

I am not sure. Definitely data, src and conf. I could see bin being useful for virtualization: the entire directory is mounted into a VM at run time, so all the bin/ scripts and programs run in the VM instead of on the host machine.

I am not sure how this maps to a Dockerized pipeline context. The idea with docker is that you don't store binaries or wrapper scripts in any centralized location. Instead you use the docker registry to store docker images, and instantiate docker containers on your host data directories. E.g. instead of installing bwa on your machine, you create an image with a wrapper script and run docker run --volume ./data:/data --volumes-from reference bwa_wrapper. So you lose the concept of 'bin' with docker. Each part of your pipeline is modularized into its own image, built from a Docker build context (a directory with a Dockerfile and any wrapper scripts, installation scripts, configuration scripts, etc. needed by the main application that the image provides).

The Docker build context directories could go into either src or conf, since they are primarily plain text, plus maybe tarballs to install certain versions of compiled software. One thing that might still go in bin in a dockerized context is wrapper scripts around docker, e.g. scripts that call docker run ..., but this is getting less useful with docker-compose.
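
To make the "wrapper scripts around docker" idea concrete, here is a minimal sketch of what such a bin/ script could look like. It only assembles the docker run command from the example above (image bwa_wrapper, data container reference, host directory ./data mounted at /data); the function name build_docker_run and its parameters are hypothetical, and resolving the host directory to an absolute path is an assumption on my part, made because bind mounts have not always accepted relative host paths.

```python
import os
import shlex


def build_docker_run(image, host_data_dir, volumes_from=None, extra_args=()):
    """Return the argv list for a `docker run` call. Nothing is executed here.

    `build_docker_run` is a hypothetical helper for illustration only.
    """
    # Assumption: hand docker an absolute host path for the bind mount.
    host_path = os.path.abspath(host_data_dir)
    argv = ["docker", "run", "--volume", f"{host_path}:/data"]
    if volumes_from:
        # Reuse the volumes of another container, e.g. one holding reference data.
        argv += ["--volumes-from", volumes_from]
    argv.append(image)
    # Anything after the image name is passed through to the container.
    argv.extend(extra_args)
    return argv


if __name__ == "__main__":
    cmd = build_docker_run("bwa_wrapper", "./data", volumes_from="reference")
    # Print the command instead of running it, so the sketch works without docker.
    print(shlex.join(cmd))
```

To actually launch the container, the returned list could be passed to subprocess.run(cmd, check=True). With docker-compose the same mounts would be declared once in a compose file, which is why a script like this becomes less necessary.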