AndersenLab / wi-gatk

The new GATK-based pipeline for wild isolate C. elegans strains

container works on quest #21

Closed danrlu closed 4 years ago

danrlu commented 4 years ago
  1. To get the Docker container to work on Quest, you need to (1) run module add singularity/latest in the terminal before starting nextflow run (for some reason this line doesn't work properly inside the config), and (2) set container = 'docker://andersenlab/gatk4:latest' inside the config.

  2. I added -XX:ConcGCThreads=${task.cpus} back to GATK HaplotypeCaller as a fail-safe, since Java garbage collection is multi-threaded and may exceed the available resources and throw an error.

  3. The resources for sm, md, and lg in quest.config were adjusted based on the resources available on Quest and the actual resource usage from the timeline.html generated by Nextflow. Please do not modify them.
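The two Quest steps in item 1 could be wrapped in a small launch helper. This is only a sketch, not the lab's actual launch script: the function name launch_pipeline, the DRY_RUN switch, and the main.nf entry point are assumptions made for illustration. Only module add singularity/latest and the container line come from the comment above.

```shell
# Hypothetical launch helper; launch_pipeline, DRY_RUN and main.nf are assumptions.
# The config is assumed to already contain:
#   container = 'docker://andersenlab/gatk4:latest'

launch_pipeline() {
    # Load Singularity in the shell before Nextflow starts.
    # `module` only exists on systems with environment modules, so guard the call.
    if command -v module >/dev/null 2>&1; then
        module add singularity/latest
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "nextflow run main.nf $*"
    else
        nextflow run main.nf "$@"
    fi
}
```

Calling `DRY_RUN=1 launch_pipeline -resume` prints the command that would be run, which is a cheap way to check the wrapper before submitting anything.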

danielecook commented 4 years ago

To get the Docker container to work on Quest, you need to (1) run module add singularity/latest in the terminal before starting nextflow run (for some reason this line doesn't work properly inside the config), and (2) set container = 'docker://andersenlab/gatk4:latest' inside the config.

The container should be specified in nextflow.config, as it is used both locally (for testing) and on Quest. I'm not sure why you need the docker:// prefix (it's probably better to leave it), but it may be because you are using a different Nextflow version, as you commented out the NXF_VER check. I would recommend you leave that in, as it ensures that the version used is consistent. You can use direnv on Quest to set variables at the directory level automatically when you enter a directory.

I added -XX:ConcGCThreads=${task.cpus} back to GATK HaplotypeCaller as a fail-safe, since Java garbage collection is multi-threaded and may exceed the available resources and throw an error.

OK, I would not expect this to make a difference. Do you see jobs failing less often with it enabled? The JVM already sets this automatically. Generally, I would think you set it to restrain the number of GC threads.

The resources for sm, md, and lg in quest.config were adjusted based on the resources available on Quest and the actual resource usage from the timeline.html generated by Nextflow. Please do not modify them.

None of those changes were committed. Values were only modified for testing purposes.

danrlu commented 4 years ago

I'm not sure why you need the docker:// prefix (it's probably better to leave it), but it may be because you are using a different Nextflow version, as you commented out the NXF_VER check.

docker:// tells Nextflow to pull from Docker Hub. Or did you download the container image somewhere that I should have read from? https://www.nextflow.io/docs/latest/singularity.html

I use Nextflow 20.01.0. It is managed by conda instead of having NXF_VER set up, so I commented that line out.

I added back -Xmx${task.memory.toGiga()-3}g -Xms${task.memory.toGiga()-4}g for import_genomics_db. For the reason, see my last pull request.
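To make the arithmetic in those two flags concrete, here is a plain-shell sketch of what the expressions expand to for a given task memory. The function name heap_flags is hypothetical, and the 16 GB figure in the usage note is only an example value, not a setting from quest.config.

```shell
# heap_flags MEM_GB: print the JVM heap options for a task with MEM_GB of memory.
# Mirrors -Xmx${task.memory.toGiga()-3}g -Xms${task.memory.toGiga()-4}g:
# the maximum heap sits 3 GB below the task memory and the initial heap 4 GB below.
heap_flags() {
    mem_gb=$1
    echo "-Xmx$((mem_gb - 3))g -Xms$((mem_gb - 4))g"
}
```

For a task that requests 16 GB, `heap_flags 16` prints `-Xmx13g -Xms12g`, so the Java heap always stays a few gigabytes below the memory the scheduler granted.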

danielecook commented 4 years ago

With regard to Nextflow, I have it installed with Homebrew. This allows you to set the version more flexibly without needing to manage the installation with conda. I would suggest you enforce some versioning if multiple people will be running the same pipeline.
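One possible way to enforce a version, sketched in shell. The function name require_nxf_ver is hypothetical, and 20.01.0 is used only because it is the version mentioned earlier in this thread. This is not the check that was commented out of the pipeline. The export line is the kind of thing that could live in a direnv-managed .envrc.

```shell
# Pin the Nextflow version for this directory (example value from this thread).
export NXF_VER=20.01.0

# require_nxf_ver EXPECTED: return non-zero unless NXF_VER matches EXPECTED.
require_nxf_ver() {
    expected=$1
    if [ "${NXF_VER:-}" != "$expected" ]; then
        echo "NXF_VER is '${NXF_VER:-unset}', expected '$expected'" >&2
        return 1
    fi
}
```

A launch script could call `require_nxf_ver 20.01.0 || exit 1` before nextflow run, so that everyone running the pipeline fails fast on a mismatched version rather than partway through a run.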

danrlu commented 4 years ago

Got it! Thanks!