Closed kevinushey closed 7 years ago
The idea is that the inline flag declarations replace the default
section of the flags.yml file (i.e. it's actually not required as it is with raw use of the config package). So there shouldn't in fact be any duplication. There is a bit of extra bother in this workflow though as you need to declare flag types, but to me this provides both added robustness as well as better inline documentation of valid
flags within the R script.
The flags.yml
as it's used here is really designed for profile-based overrides of defaults, so in many situations won't be provided at all (of course it will nearly always be provided with cloudml, if only to provide indirection on the training and test data).
All of that said, I just noticed that right now we don't seem to be allowing for the omission of the default
key in flags.yml! I'll investigate and hopefully report back with a fix soon...
Okay, I've fixed this issue so you now exclude the "default" section entirely. The 'flags.yml' file is optional, and for your cloudml demos you should be able to:
1) Move the "default" section from the config file into the R script as typed FLAGS declarations
2) Include only cloudml overrides in the config file, e.g. the entirely of the config file could be:
cloudml:
training_data: gs://....
test_data: gs://...
Ahhh, got it! That makes complete sense and I'm on board with that. Thanks for the clarification!
In some initial bits of porting from the old config-based approach to the tfruns flags approach, I saw errors of the form:
Should we consider silently allowing these fields? It does seem a bit unfortunate that one must effectively duplicate the declarations between the
flags()
command invocation and theflags.yml
configuration(s). It might be nice for some workflows to just allow a plain old:and just populate all the flags as discovered in
flags.yml
(and overlay whatever options are specified on the command line). There's probably a reason why this approach wasn't taken but just want to be sure.