Open magicDGS opened 6 years ago
While a good idea in principle, and maybe the way to go in the future, the current problem is that recent Docker versions are not supported on our many MacOS 10.8.5 nodes.
Oops, I didn't know that!
On the other hand, I suggest that our MacOS systems should be updated soon anyway, mostly because of the security issues from the Spectre/Meltdown bugs. Thus, the incompatibility problem will be solved soon.
Another problem: it looks like it is unsafe to run Docker on Hadoop, but 3.0.0 would bring support for that (still experimental): https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/DockerContainers.html
Another reference: https://thenewstack.io/docker-hadoop-theres-good-bad-ugly/
I thought that using docker containers for Distmap might be a good way to remove the dependency on binary files that are system dependent. The main idea is:
- Swap the `--mapper-path` option for a `--mapper-docker` one. This might allow using exactly the same binaries regardless of the computers in the Hadoop cluster (e.g., allowing machines with different systems to be part of the same cluster).
- The `bwa_mapping.pl` command line will use `docker run` to run the command. The only difference will be in the arguments. This might allow pulling out the common code that constructs the command line and runs it.
- Use the `PATH` if no docker is provided; the base image can be linux/macos depending on the caller system, which supports calls from both environments. This is in line with @robmaz's idea of using the `PATH` to look for binaries, but also adds the operating system to the base image to allow compatibility.

What do you think about this, @robmaz?
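
To make the idea more concrete, here is a rough sketch of how the command line could be built in one place, with only the prefix changing between the docker and non-docker cases. It is written in Python for brevity rather than in the Perl of `bwa_mapping.pl`, and the function name, its parameters, and the image name `example/bwa:latest` are all hypothetical; only `docker run --rm` is a real Docker invocation. It keeps a path option next to the docker one purely to show the shared code path.

```python
import shutil


def build_mapper_command(mapper_args, mapper_path=None, mapper_docker=None,
                         binary="bwa"):
    """Return the argument list used to launch the mapper.

    All names here are hypothetical and only mirror the proposal above.
    """
    if mapper_docker is not None:
        # Containerized run: the same binary regardless of the node's system.
        prefix = ["docker", "run", "--rm", mapper_docker, binary]
    elif mapper_path is not None:
        # Explicit path to a system-dependent binary.
        prefix = [mapper_path]
    else:
        # No docker image and no path given: look the binary up on PATH.
        resolved = shutil.which(binary)
        if resolved is None:
            raise FileNotFoundError(f"{binary} not found on PATH")
        prefix = [resolved]
    # The mapper arguments are identical in all three cases.
    return prefix + list(mapper_args)


print(build_mapper_command(["mem", "ref.fa", "reads.fq"],
                           mapper_docker="example/bwa:latest"))
# ['docker', 'run', '--rm', 'example/bwa:latest', 'bwa', 'mem', 'ref.fa', 'reads.fq']
```

A real implementation would probably also need to mount the input and output directories into the container (for example with `docker run -v`), which this sketch leaves out.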