Closed joverlee521 closed 9 months ago
Thanks for the feedback @huddlej! I fully agree the term "profiles" can be confusing. I'm hesitant to use the term "builds" because it's so overloaded, especially in the Nextstrain ecosystem.
Maybe just "customizations" is descriptive enough? Using the phylogenetic workflow as an example:
phylogenetic/
├── README.md
├── Snakefile
├── config
│ └── defaults.yaml
├── customizations
│ └── ci
│ ├── config.yaml
│ └── copy_example_data.smk
├── example_data
│ ├── metadata.tsv
│ └── sequences.fasta
└── rules
├── annotate_phylogeny.smk
├── construct_phylogeny.smk
├── export.smk
└── prepare_sequences.smk
@joverlee521 Why separate config/
and customizations/
in that suggestion? (or in the PR as-is, separate config/
and profiles/
)
I'd think this (or something similar) would be just fine?
├── config
│ └── defaults.yaml
│ └── ci
│ ├── config.yaml
│ └── copy_example_data.smk
Why separate config/ and customizations/ in that suggestion?
I proposed to keep them separate for two reasons:
config/
simple with just the default configs and not bloat it with customizations. config/
. (1) is based on experience with the seasonal flu repo, which has a pretty full config dir with sub-directories for the subtypes. It might be confusing to see the current profiles mixed in:
config
├── allflu
├── ci
├── distance_maps
├── europe
├── example
├── gisaid
├── h1n1pdm
├── h3n2
├── hutch
├── nextflu-private
├── nextflu-private-forecasts
├── nextflu-public
├── nextstrain
├── nextstrain-public
├── private.nextflu.org
├── scicore
├── upload
├── vic
└── yam
We could keep them under config/customizations/
for clearer separation. My concern for (2) still stands in this case, but maybe I'm overthinking it...
config/
├── customizations
│ ├── allflu
│ ├── ci
│ ├── europe
│ ├── example
│ ├── gisaid
│ ├── hutch
│ ├── nextflu-private
│ ├── nextflu-private-forecasts
│ ├── nextflu-public
│ ├── nextstrain
│ ├── nextstrain-public
│ ├── private.nextflu.org
│ ├── scicore
│ └── upload
├── distance_maps
├── h1n1pdm
├── h3n2
├── vic
└── yam
Oh I see. Yeah, that makes sense.
Do we have a working definition of "config" vs "customisation" vs (snakemake?) "profiles" to help me interpret this? (And maybe add "defaults" to this too.)
If we didn't use snakemake profiles for the seasonal flu would the files in profiles/
be additional directories within config/
?
The linked seasonal flu repo is certainly complex in the sense of lots of directories, and perhaps necessarily so, but it'd be great to avoid introducing this complexity into repos which don't need it.
More generally, I wish we decoupled our profile configuration files ... from our build configuration files ... since they are such different concepts.
+1
Do we have a working definition of "config" vs "customisation" vs (snakemake?) "profiles" to help me interpret this? (And maybe add "defaults" to this too.)
Definitely missing explicit definitions, my interpretations are:
In my definition, I am basically merging config and defaults, where anything under config/
should be considered the default configs used for a workflow. This makes it very easy for users to see what the default configs are for the workflow at first glance.
If we didn't use snakemake profiles for the seasonal flu would the files in profiles/ be additional directories within config/?
I think I would still keep it separate for the sake of keeping the default configs very simple and obvious.
Closing in favor of https://github.com/nextstrain/pathogen-repo-template/pull/27
Description of proposed changes
Reduce confusion of the use of the profiles' config files. They are to be seen as additional profiles' config on top of the default configs. Stems from discussion with @j23414 for review of https://github.com/nextstrain/dengue/pull/15.
Note we not using
config.yaml
because this is used by the Snakemake profiles configuration.¹¹ https://snakemake.readthedocs.io/en/stable/executing/cli.html#profiles