Open vreuter opened 7 years ago
To take this a step further, we may at some point want to implement a config file parser/checker that reports on the health of your config file. It could do a bunch of checks like this one and suggest places where you could improve. This seems like a good thing to think about in the longer term, when PEP becomes more widespread.
Cool, I like the sound of that.
parser/checker of config file
- Do we still want to implement it? What should it look like? Should we have a generic config schema?
This idea came up when I was writing a configuration file that uses the `subprojects` section. For each of my subprojects, I was defining an alternate `output_dir` and `sample_annotation`, but I was nesting these directly under the subproject name itself rather than within a `metadata` subsection.

This led to what seemed like a failure by `AttributeDict` to substitute, during `parse_config_file`, the subproject-specific values for the general project ones that had been redefined. While `subprojects` may not be a heavily used feature, I could see this being a *braces/grimaces at double negative* not-infrequent error. It's not a big deal if a user is being careful and first using `dry-run`, but if the submission was actually done and caused way more samples to be submitted than had been intended, that could cost a lot of unintended compute time/$. Either way, the user would need to be able to figure out what was wrong with the config file, which may not be entirely intuitive.

I think we've discussed keeping the config section name definition framework as flexible as possible. I definitely agree, but I think that there could be some value in, say, using knowledge of keywords like `output_dir` and `sample_annotation` to suggest proper placement (i.e., some sort of warning if they're present but not placed within `metadata`). The keywords that come to mind are the common `metadata` ones: `output_dir`, `sample_annotation`, `results_subdir`, `submission_subdir`, `pipeline_interfaces`.
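
To make the proposal concrete, here is a rough sketch of what such a placement check might look like. It is not based on the existing code: the function name `find_misplaced_keys`, the `METADATA_KEYS` set, the example subproject name, and the paths are all made up for illustration, and it assumes the config has already been loaded into plain nested dicts (e.g. from YAML). It walks the config and warns about any of the keywords listed above that appear outside a `metadata` section.

```python
# Hypothetical sketch; none of these names come from the project's codebase.
# Keywords taken from the list above that are expected to live under 'metadata'.
METADATA_KEYS = {
    "output_dir",
    "sample_annotation",
    "results_subdir",
    "submission_subdir",
    "pipeline_interfaces",
}


def find_misplaced_keys(config, path=()):
    """Return warnings for metadata keywords found outside a 'metadata' section."""
    warnings = []
    if not isinstance(config, dict):
        return warnings
    for key, value in config.items():
        here = path + (key,)
        if key == "metadata":
            # Anything nested under 'metadata' is correctly placed; skip it.
            continue
        if key in METADATA_KEYS:
            location = ".".join(map(str, here))
            suggestion = ".".join(map(str, path + ("metadata", key)))
            warnings.append(
                f"'{location}' looks like a metadata key; "
                f"did you mean '{suggestion}'?"
            )
        # Keep descending so nested sections (e.g. subprojects) are checked too.
        warnings.extend(find_misplaced_keys(value, here))
    return warnings


# Example config reproducing the mistake described above (made-up values).
config = {
    "metadata": {"output_dir": "/results", "sample_annotation": "samples.csv"},
    "subprojects": {
        "diverse": {
            "output_dir": "/results/diverse",  # misplaced: not under 'metadata'
            "metadata": {"sample_annotation": "diverse.csv"},
        }
    },
}

for message in find_misplaced_keys(config):
    print(message)
```

A check like this only warns and never rejects the config, so section names stay as flexible as they are now. It could run as part of `dry-run` so the user sees the hint before anything is submitted.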