Closed nservant closed 4 years ago
@drpatelh , even in the ChIP-seq or ATAC-seq projects, all parameters do not have the same format. Should I wait a bit before changing all my parameters :) ?
They should be! The only parameter is maxMultiqcEmailFileSize
which I have now updated in the template (see https://github.com/nf-core/tools/pull/425). Which ones have you spotted?
What about --bwa_index
, --gene_bed
, --tss_bed
, --macs_gsize
, --mito_name
, etc. ?
Ah, sorry. Maybe I didn't explain properly. Any parameters that require a user input e.g. (file path, integer etc) are in snake_case
format and any that are boolean switches are in camelCase
format. Maybe we can use this thread for discussion.
@ewels @apeltzer What do you think about this? This is something we initially discussed way back in 2 years ago Barcelona at the NF conference. Quite a few pipelines are now adopting this approach e.g. rnaseq
(mostly I think), chipseq
, smrnaseq
and a few others. I have also change the template yesterday for consistency. If you both agree we can create an issue on the website to add some docs. Mainly so it's not lost in the ether again :sweat_smile:
Ah !! ok, I didn't get it. This is up to you. I will update according to your best practice. If I can just make a comment, my feeling is that choosing a single format for all variables is much more convenient (and easy to follow for new developers).
My feeling is also that it would be simpler to just have one and stick to it (my preference would be snake_case
). I know we discussed this before, I don't remember why we wanted both..
Its easier to distinguish parameter type? Personally, I think its easier to work out what a parameter is doing i.e. does it require some sort of input or is it just a switch. I think you probably done this subconsciously for the NGI pipelines because they had parameters like saveAlignedIntermediates
, saveReference
, singleEnd
that were boolean and then others that werent e.g. bwa_index
.
But surely users should know what parameters do before using them..? 😉 And we should have parameter validation pretty soon. And personally I find it harder to remember which parameters use which style.
I think that the discrepancy in the old pipelines was more down to who wrote the code - me or @Hammarn 😉
So the beauty of having the two styles is that you can pretty much figure out what a parameter is doing before you know what it's doing or even before using it!
Ok. Lets put it to the test! Probably not the best example because you know what most of these parameters are doing already (:wink:). If you had to look through them without knowing anything about what they are doing doesn't it help to figure out which ones require an input and which ones don't? To add to that the case distinction allows the boolean
ones to be pretty much self-explanatory. Just gives some visual partitioning to a massive parameter set that would otherwise all look the same :smile_cat:
https://github.com/drpatelh/chipseq/blob/b2fdfd82b76262d57478a81913298189b983badf/main.nf#L22
Why not use the [type]
notation in the help docs, as already done for --clip_r1
and others? More explicit, more consistent and easier for beginners 😉
--single_end [bool] Specifies that the input is single-end reads
--seq_center [str] Sequencing center information to be added to read group of BAM files
Or maybe [flag]
for the bools to make it clearer. Or nothing for those, but [str]
for anything with an argument... But you get my point anyway.
Ok. Having thought about it a little, I'm coming round to using snake_case
for all parameters. It is probably best for standardisation. As you mentioned, we need to consistently enforce the use of [type]
notation in main.nf
and in usage.md
because most people won't look at the main script.
Let's agree on this so I can update the template to reflect this and we will need to add some docs to the website too. I have a few pipelines that are ripe for release now and so I can make the update sooner rather than later.
@nf-core/core are we all in agreement? Speak now or forever hold your peace :wink:
Nice! This should go into the guidelines that @MaxUlysse is starting to draft. He maybe has strong views on the issue?
Note that we should be auto-generating the pipeline help text from the parameters settings file soon (https://github.com/nf-core/hlatyping/pull/61), so that will make that part much easier. And we can / should also lint the documentation from that (https://github.com/nf-core/tools/issues/111), so can also enforce standardisation there.
Snake case is fine by me - as long as we all adhere to it and make things as the automated linting by @ewels possible ;-)
Ive made some headway with this for the atacseq
and chipseq
pipelines:
https://github.com/drpatelh/nf-core-atacseq/blob/7e2c171753bebe093e129d2163041cc54b64435e/main.nf#L21-L87
Any comments? Also changed --design
to --input
for standardisation.
Nice!
Thanks! It's one of those unwritten rules that we've discussed and implemented to various degrees in some of the main pipelines but haven't had the time to officially document :sweat_smile: You could change the other parameters too? If it helps, if things do change then I'll have to update numerous pipelines! I've also updated the template today to change the last remaining parameter to this format... https://github.com/nf-core/tools/pull/425/files#diff-4ac32a78649ca5bdd8e0ba38b7006a1eR13
Originally posted by @drpatelh in https://github.com/nf-core/hic/pull/37