nextflow-io / nf-schema

Functionality for working with pipeline and sample sheet schema files in Nextflow pipelines
https://nextflow-io.github.io/nf-schema/
Apache License 2.0
12 stars 21 forks source link

Idea: Validate parameters when the parameter has been changed #51

Open nvnieuwk opened 2 months ago

nvnieuwk commented 2 months ago

A nice feature for nf-schema would be some kind of option to make the plugin automatically check a parameter again when it has been changed.

Ideally this feature could completely replace the validateParameters function to work out of the box, but not sure if this is possible as a nextflow plugin without making the code extremely slow. This would need some testing :)

bentsherman commented 2 months ago

What do you mean when a parameter has changed? I would argue that the params map should be immutable in the script, except perhaps for some initial definitions

nvnieuwk commented 2 months ago

Yep I agree with you! But it would be nice to validate these initial changes too. The current way solves this by using a function that gets called after these are set, but I'd rather move the validation to some configuration option to in order to make it even easier to use the plugin. I'm just spitballing here so no idea if it even will be possible but it would be nice

bentsherman commented 2 months ago

To be honest, I think even the script shouldn't be allowed to assign parameters. It is a shorthand to provide some sensible defaults in the script, but really the schema file should be the source of truth for such defaults (or a "params" block in the script but that's still just an idea).

And that way you won't have to go to all of this trouble to do dynamic validation 😉

nvnieuwk commented 2 months ago

Also true, Default assigning has also been requested for the plugin so this would also be possible once we move to a start-of-pipeline validation (https://github.com/nextflow-io/nf-schema/issues/35). But yeah this is probably some nice future thinking and it will be hard as long as params stay mutable but we'll see :p

awgymer commented 3 weeks ago

One major limitation (currently) of inferring defaults from the schema is that they cannot be set dynamically or use interpolated strings.

For the first an example is how we currently assign default value for params.fasta (getting it via a function that looks at what is available in the genomes config).

For the second, the classic case is the params.outdir where people use something like e.g. path/to/outdir_${date}/

nvnieuwk commented 3 weeks ago

Yeah that's true and a valid concern, we should write some kind of resolver for those dynamic defaults. But this feature isn't really possible now that params can still be changed in workflows

awgymer commented 3 weeks ago

I did add something to nf-core schema create at the last hackathon which attempts to warn and fix divergent defaults, but obviously that is not a particularly strong solution!

nvnieuwk commented 3 weeks ago

Yep I saw that! Let's hope it will get easier

bentsherman commented 3 weeks ago

I would say the best practice is to have static default values in the params schema and use a variable in the workflow logic if you need some dynamic expression. You could also generate a params file based on some dynamic values