I'm trying to limit the resource usage per job so I can optimize the total run time across two or more jobs. How do I do that? I start the imputation server like so:

docker run -t -p 8080:80 -e DOCKER_CORES="16" -v $(pwd):/data/ --name imputeserver-16cores genepi/imputationserver
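For context, the only resource knob I've found so far is DOCKER_CORES at container startup, so in principle I could split jobs across two smaller containers, roughly like below (a sketch; it assumes the image actually sizes its Hadoop slots from DOCKER_CORES, and the names, ports, and mount paths are just examples). But I'd much rather control resources per job within a single server.

# run two smaller servers in the background, each with its own port and data dir
docker run -d -t -p 8080:80 -e DOCKER_CORES="8" -v $(pwd)/job-a:/data/ --name imputeserver-a genepi/imputationserver
docker run -d -t -p 8081:80 -e DOCKER_CORES="8" -v $(pwd)/job-b:/data/ --name imputeserver-b genepi/imputationserver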
And I start each job like so:

docker exec -t -i imputeserver-16cores cloudgene run imputationserver --files /data/input.vcf.gz --refpanel apps@hapmap2 --conf /etc/hadoop/conf
I figured I could perhaps copy the files from /etc/hadoop/conf to a local directory, change the settings there, and point --conf at that directory instead, e.g. /data/conf. So far, though, changing the values in mapred-site.xml doesn't affect the number of map/reduce tasks that get created. Should this approach work?
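Here's roughly what I've been trying (a sketch; the property names assume the container runs classic MR1 Hadoop, which I haven't verified, and the slot counts are just example values):

# copy the container's default Hadoop config into the mounted volume, then edit it
mkdir -p conf
docker cp imputeserver-16cores:/etc/hadoop/conf/. ./conf/

In conf/mapred-site.xml, inside <configuration>, I set per-node slot limits:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>

And then started the job pointing at the edited copy:

docker exec -t -i imputeserver-16cores cloudgene run imputationserver --files /data/input.vcf.gz --refpanel apps@hapmap2 --conf /data/conf

One thing I'm unsure about: from what I've read, the *.tasks.maximum settings are read by the TaskTracker daemon at startup rather than per job, which might explain why passing them via --conf has no effect. If so, is there a per-job equivalent?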
I also see in the terminal output that "/data/apps/imputationserver/job.config" is unavailable. Perhaps that's where I can define the number of map/reduce tasks per job? I've looked around but haven't found any documentation on it, so I don't know what it does.
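In case it's useful, this is how I've been poking around for that file (exploratory commands only; I don't know job.config's format or keys, so I haven't guessed at any settings):

# /data/ is the host mount, so the missing path should be visible from the host too
ls -l apps/imputationserver/

# and search the container for any bundled job.config template to copy from
docker exec -t -i imputeserver-16cores find / -name job.config 2>/dev/null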