ellisrichardj / BovTB-nf

Nextflow script for processing WGS data

Restrict memory usage of AssignClusterCSS process #23

Closed ellisrichardj closed 3 years ago

ellisrichardj commented 3 years ago

Unfortunately, the memory directive does not work when the executor is set to 'local', so I implemented a different approach. The high memory usage was caused by 'Stage1-test.py' reading the data for the whole genome into memory, even though it only requires the data for the discriminatory genome positions. The VCF file is now filtered to include only those positions, which makes the Python script much more memory efficient. The report now shows a maximum memory usage of just 11 MB.
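For illustration only, the filtering idea can be sketched as a streaming pass over the VCF that keeps header lines and only those records whose (CHROM, POS) appears in a set of discriminatory positions. This is a minimal sketch, not the code used in the pipeline: the function names `load_positions` and `filter_vcf` and the tab-separated "CHROM, POS" format of the positions list are assumptions made for the example.

```python
def load_positions(lines):
    """Parse tab-separated 'CHROM<TAB>POS' lines into a set of (chrom, pos).

    The positions-file format is an assumption for this sketch.
    """
    positions = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, pos = line.split("\t")[:2]
        positions.add((chrom, int(pos)))
    return positions


def filter_vcf(vcf_lines, positions):
    """Yield VCF header lines plus records located at the wanted positions.

    Lines are processed one at a time, so memory use depends on the size
    of the positions set rather than on the size of the whole VCF.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep meta-information and column header lines
            continue
        fields = line.split("\t", 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if (fields[0], int(fields[1])) in positions:
            yield line
```

Because `filter_vcf` is a generator, it can be fed an open file handle and its output written straight to a new file, so the downstream script only ever sees the reduced set of records.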

ellisrichardj commented 3 years ago

Assigned to the wrong repo