CDCgov / cfa-epinow2-pipeline

https://cdcgov.github.io/cfa-epinow2-pipeline/
Apache License 2.0
10 stars 2 forks source link

Providing defaults to `fit_model` and more generally #33

Open athowes opened 3 weeks ago

athowes commented 3 weeks ago

In PR #26 we added modelling functionality, and chose not to add defaults for arguments like the prior:

I can add defaults, but I purposely didn't and don't think we should. My thought was that we don't plan to use this function (or package) interactively. We're going to pass a task config with all parameters and won't depend on default arguments. If we misspecify something, we want an error not a silent reliance on a default argument with a (potentially) different value. If the user doesn't add, e.g. priors[["rt"]][["mean"]] to the list, wouldn't it return NULL? What would happen then? Would EpiNow2::rt_opts() return an error, or just silently continue on with some default value?

image

I think this error is the desired behavior. We don't want the error, but with a config-driven rather than interactive workflow I think it's safer not to specify defaults. Is that convincing to you both?

_Originally posted by @zsusswein in https://github.com/CDCgov/cfa-epinow2-pipeline/pull/26#discussion_r1755154014_

I think this is a reasonable choice / likely correct.

I do think there could be a discussion to be had about the extent to which we are making this package "config-driven" versus "interactive". One could frame "interactive" as in "working like ordinary R package functions". And then one could argue that the pipeline should do the work of being config driven, and that functions in the R package could be functional from an "interactive" perspective / ordinary R package perspective. I do think there are instances where we / our collaborators might interact with the package in an interactive mode.

Sorry, this is quite a non actionable issue! I'd be interested in others broader perspectives on this as a broader design question. As I say, I think the choice to make things quite "config-driven" is reasonable.

I think it would be helpful to see a document put together of the expected workflow. Perhaps another part of this "package specificity" question is the extent to which this package can be used outside of an Azure / CFA-specific environment. I think it's important for documentation and development of the package that one could use it in a more basic environment.

zsusswein commented 3 weeks ago

I'm dodging the majority of your good questions here to think a little more, but to answer a few specific things:

I think it would be helpful to see a document put together of the expected workflow.

You've (rightly) asked for this a few times and it's an important thing we need to provide. Bear with me for a few more PRs until we have an actual pipeline-running function and let's revisit. I'm trying to avoid documenting ahead of the code.

I think it's important for documentation and development of the package that one could use it in a more basic environment.

Can you expand on this a bit? We have in the readme:

https://github.com/CDCgov/cfa-epinow2-pipeline/blob/c04c8f857e49530ffad946ea582f404a42098735/README.md?plain=1#L9-L10

But it sounds like you have a broader scope in mind. What use-cases are you thinking about? In particular, when would one want to use our package over vanilla EpiNow2?