Closed buckley-w-david closed 4 years ago
I think this is a great proposal! A couple small thoughts:
data
is a little generic in this context. Maybe pulse
as a command name instead?pulse
as a command name makes sense to me, it was just data
since that's what the package is called right now.
On that subject for the same reason that package name can/should also change. Perhaps to the same name.
As a counter to your comment on whitelisting, as the code is the whitelisting is already taking place, just in the command generation process.
. . .
# Allow some options passed to python -m data.update to go
# through to domain-scan.
# Boolean flags.
for flag in ["cache", "serial", "lambda"]:
value = options.get(flag)
if value:
full_command += ["--%s" % flag]
. . .
Those sections are the existing whitelist, we would just transfer that processing into one central place, and be able to pass concrete lists of arguments instead of these options dictionaries.
Now to play devils advocate to my own argument, I can see some value in making suboptions generic and passed around "as is" in the case of arguments meant to processing
once it is refactored into a more general form with "processors" that may have their own arguments, but I think that the scan and gather process is different and will remain mostly static with arguments known at time of writing.
I'd love to get rid of that code sample you included too! ;) It's essentially there because those 3 flags are just pass-throughs to domain-scan, rather than Pulse-specific flags. Maybe there's some better way to effect that pass-through.
I'm not going to be strongly opinionated on this stuff, only that we avoid any significant argument parsing code. We recently worked on adding some argument parsing/whitelisting code to domain-scan itself, and the code ballooned substantially and created some instability for a time as we found regressions. I'd rather have things documented clearly and have the interface feel rational and unix-style, and am less excited about making the argument processing itself substantially more complex.
Argument parsing is of course no fun. Have you heard of/used the click library? It's a really easy to use CLI library that I personally really like (and would be a proponent of using for this).
The (mostly) complete initial implementation of the above proposal using it is in a PR on our fork right now https://github.com/cds-snc/pulse/pull/3/files, take a look if you're curious.
click
looks great, I'd be happy to give it a go.
Thank you for your contributions to this project over the year. The website has been decommissioned, so I'm going to go ahead and close this issue.
As discussed in https://github.com/18F/pulse/issues/775 we're in agreement that the CLI can/should be changed.
I suggest an interface along this line:
With that configuration the
update
command passes along SCAN_ARGS to the scan and gather sections of update, formatted so that it is equivalent to the current options dictionary that they receive now. As it is there it would accept anything, but I would like that to eventually be replaced by a concrete list of options that it can pass along. Maybe we don't even have this intermediate step of passing along what we are given and just go right to that concrete list. Thoughts?Let me know what you think, if anything needs changing.