TORCH-Consortium / MAGMA

A pipeline for comprehensive genomic analyses of Mycobacterium tuberculosis with a focus on clinical decision making as well as research
https://doi.org/10.1371/journal.pcbi.1011648
GNU General Public License v3.0

Add a config file for the low memory users #184

Closed vrennie closed 10 months ago

vrennie commented 11 months ago

@mdediegofuertes do you feel the README is now clearer in differentiating the config files, so that you know which one to invoke and why?

mdediegofuertes commented 11 months ago

Hi @vrennie,

I do, but I'm afraid that the role of the default_params.config file is still a bit unclear. This config file still allows you to specify an input samplesheet and an output directory, as well as define some of the boolean options, like using exit-RIF. For an inexperienced user (and sometimes for junior and senior bioinformaticians too), this can very easily lead to confusion. In my opinion, the roles of the three config files should be totally compartmentalized, so that there is zero overlap between them. It would also help if the three files were in the same directory:

default_params.config: The parameters and cutoffs related to the analyses performed by MAGMA, such as minimum breadth and depth, k length, etc.
params.yaml: The parameters related to the specific run that is about to be performed: input sample sheet, output directory and boolean options (exit-RIF usage, variant recalibration optimization, whether contamination is expected in the dataset, and the option to only validate fastqs). While we are at it, it could also be interesting to add a new flag here that allows the user to automatically stop the run before the phylogeny step, which is something that would have been useful for us in the past.
nextflow.config: The parameters concerning hardware requirements and/or preferences (i.e. CPUs and memory).
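
To make the "zero overlap" idea concrete, here is a minimal Python sketch of a check that no parameter is defined in more than one of the three files. The key names below are hypothetical placeholders chosen for illustration; they are not taken from the actual MAGMA config files, and this check is not part of the pipeline.

```python
from itertools import combinations

# Hypothetical key names, for illustration only; not the real MAGMA keys.
config_keys = {
    "default_params.config": {"min_breadth", "min_depth", "k_length"},
    "params.yaml": {"input_samplesheet", "outdir", "use_exit_rif"},
    "nextflow.config": {"cpus", "memory"},
}


def find_overlaps(keys_by_file):
    """Return {(file_a, file_b): shared_keys} for every pair of files sharing a key."""
    overlaps = {}
    for (name_a, keys_a), (name_b, keys_b) in combinations(keys_by_file.items(), 2):
        shared = keys_a & keys_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# With fully compartmentalized files, no pair shares a key.
clean = find_overlaps(config_keys)

# If default_params.config also accepted an output directory, the check flags it.
mixed_keys = dict(config_keys)
mixed_keys["default_params.config"] = config_keys["default_params.config"] | {"outdir"}
mixed = find_overlaps(mixed_keys)
```

In this sketch `clean` is empty, while `mixed` reports `outdir` as shared between default_params.config and params.yaml, which is the kind of overlap described above.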

In the past, the role of the params.yaml file was filled by default_params.config itself, but since this is no longer the case, the current structure of the default_params.config file makes it confusing as to which config file governs which aspects of the pipeline.

vrennie commented 11 months ago

Hi @mdediegofuertes,

Thanks for the feedback. I added some more clarifications to the README. The nextflow.config and default_params.config files should not be modified by the user. If there is a conflict, the params.yaml file overrides what is in default_params.config.
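
As a rough illustration of that precedence rule (not how Nextflow implements it internally), the following Python sketch merges two parameter sets so that the run-specific values win on conflict. The parameter names and values are hypothetical and only stand in for entries in default_params.config and params.yaml.

```python
def resolve_params(defaults, run_params):
    """Merge two parameter dicts; values from run_params win on conflict."""
    resolved = dict(defaults)    # start from a copy of the defaults
    resolved.update(run_params)  # run-specific values override matching keys
    return resolved


# Hypothetical stand-ins for default_params.config and params.yaml.
defaults = {"outdir": "magma-results", "use_exit_rif": False, "min_depth": 10}
run_params = {"outdir": "my_run_results", "use_exit_rif": True}

resolved = resolve_params(defaults, run_params)
```

Here `resolved` takes `outdir` and `use_exit_rif` from the run-specific values, while `min_depth`, which has no run-specific entry, keeps its default.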

Do you feel this is clearer now?