nrlab-CRUK / INVAR2

Restructured version of INVAR

Update 1_parse.nf #4

Closed bjpop closed 2 years ago

bjpop commented 2 years ago

Add memory requirement to avoid failing in SLURM environment when memory use exceeds some default value.

Without this setting the pipeline fails on our SLURM cluster with an error:

```
/opt/conda/envs/invar2/bin/picard: line 66: 134483 Killed /opt/conda/envs/invar2/bin/java -Xms512m -Xmx2g -jar /opt/conda/envs/invar2/share/picard-2.26.10-0/picard.jar CreateSequenceDictionary "--REFERENCE" "human_g1k_v37_decoy.fasta" "--OUTPUT" "human_g1k_v37_decoy.dict"
```

As you can see, the process was killed because it exceeded its memory request.
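In Nextflow, this kind of per-process memory request is declared with the `memory` directive, and the JVM heap can be capped just below it so SLURM doesn't kill the task. The sketch below is illustrative, not the exact change in this PR: the process name, the 4 GB figure, and the assumption that the bioconda `picard` wrapper forwards a leading `-Xmx` option to the JVM are all assumptions.

```groovy
// Hypothetical sketch of a Nextflow process with an explicit memory request.
process createSequenceDictionary {
    memory '4 GB'   // what SLURM will allocate for this task

    input:
    path reference

    output:
    path "${reference.baseName}.dict"

    script:
    // task.memory reflects the directive above; give the JVM 128 MB
    // less than the allocation to leave room for non-heap overhead.
    def jvmMem = task.memory.toMega() - 128
    """
    picard -Xmx${jvmMem}m CreateSequenceDictionary \\
        --REFERENCE ${reference} \\
        --OUTPUT ${reference.baseName}.dict
    """
}
```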

rich7409 commented 2 years ago

Thanks for the input.

First, the memory requirement: the CreateSequenceDictionary task didn't limit the memory given to the JVM, so I've added parameters to set a maximum. I can't imagine this task needs more than the default 1 GB, but without a limit the garbage collector might never kick in, and the process will claim more and more memory until the cluster system kills it. The change makes sure the JVM won't use more than the allocated memory (it actually assigns the JVM 128 MB less than is given to the task, to allow for overhead). I've also added text to Running.md describing how to define memory requirements on a per-project basis in the project's nextflow.config, so if CreateSequenceDictionary still blows up, that describes how to increase the allocation.

Second, your change to samtools_faidx is specific to Ensembl chromosome naming (ours was specific to UCSC). To support either UCSC or Ensembl naming without having to change the workflow, I've added a parameter CHROMOSOME_ID_PREFIX to nextflow.config. It is just a string, but it should really be either "chr" for UCSC references or the empty string ('') for Ensembl ones. The default is "chr", but again it can be set differently in a project's nextflow.config.

Rich.
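Putting the two per-project overrides together, a project's nextflow.config might look like the following sketch. The process selector name is an assumption; only the CHROMOSOME_ID_PREFIX parameter name comes from the comment above.

```groovy
// Hypothetical project-level nextflow.config overrides.

params {
    // Ensembl-style reference: no "chr" prefix on chromosome names.
    // Use 'chr' (the default) for UCSC-style references.
    CHROMOSOME_ID_PREFIX = ''
}

process {
    // Raise the allocation if CreateSequenceDictionary is still killed;
    // the JVM heap is derived from this value, minus overhead.
    withName: createSequenceDictionary {
        memory = '8 GB'
    }
}
```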