Closed zsusswein closed 1 month ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 100.00%. Comparing base (ff136bf) to head (a890538).
Alright I've spent the last week going in circles on the feedback (1) around dates and (2) on formatting inputs as a dataframe vs independent vectors. Here's where I landed:
Supply the individual columns of the dataframe as vectors (e.g., RtGam(df[["cases"]], df[["dates"]])) rather than the dataframe. It's convenient to avoid requiring specific column names, but I think the real benefit is the explicitness and clarity for the users and in the code. By passing vectors, it's clear what's required and where it's supplied, and we avoid the temptation to have mutable objects or in-place changes to dataframes. I think {EpiEstim} benefits from this clarity: it takes a simple incidence vector and the use is very clear. Likewise in Python, statsmodels takes the outcome as a distinct vector from the predictors (but does allow predictors as a matrix). I like this clarity in the API and would like to keep it for now unless things become quite unwieldy. tl;dr I've gone back and forth, but I want to keep things as-is for now. Thanks to @kaitejohnson for an outside perspective on these choices.
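For reference, a minimal sketch of what the vector-per-item call pattern could look like. The helper name new_rtgam_input(), its arguments, and its return value are hypothetical stand-ins for illustration, not the package's actual RtGam() signature:

```r
# Hypothetical stand-in for the constructor discussed in this thread; the real
# RtGam() arguments and return value may differ.
new_rtgam_input <- function(cases, reference_date, group = NULL) {
  # Each required item is its own vector, so the checks are explicit.
  stopifnot(
    is.numeric(cases),
    inherits(reference_date, "Date"),
    length(cases) == length(reference_date)
  )
  if (is.null(group)) {
    group <- rep(NA_character_, length(cases))
  }
  stopifnot(length(group) == length(cases))
  # A new object is returned; the caller's dataframe is never modified.
  data.frame(cases = cases, reference_date = reference_date, group = group)
}

df <- data.frame(
  n_cases = c(3, 5, 8),
  day = as.Date("2024-01-01") + 0:2
)

# Column names in `df` are arbitrary because each vector is passed explicitly.
input <- new_rtgam_input(cases = df[["n_cases"]], reference_date = df[["day"]])
```

The point of the sketch is only that no column-name convention is imposed on the caller's dataframe.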
Thoughts @seabbs? Are you good merging as-is?
I'm happy to see this merged and happy that a lot of thought has gone into this.
supply the individual columns of this dataframe as vectors (e.g., RtGam(df[["cases"]], df[["dates"]])) rather than the dataframe
However, I don't really agree with this design decision and find the arguments in favour of it quite weak. In particular, calling out EpiEstim as having clarity in its design language seems problematic to me, as people regularly complain to me about it. My particular concern is that it locks you into always having a specific set of vectors, which means that if you later wanted to support a formula interface, etc., it would be quite hard.
In the interest of keeping things moving towards a 0.1 release, I'm going to keep things as-is for now and move this conversation to an issue. I'll assign the issue to the 0.2 release.
This PR takes the stub codebase and replaces placeholders with some initial structure. It implements
The errors are classed and pass the calling environment to make the error messages more user-friendly. This follows the guidance in rlang.
The eventual user flow will use RtGam::RtGam() as a user-facing S3 class constructor with internal helpers and validators. There will be an additional dataset constructor that does input value checking (e.g., no repeated dates within groups). I originally intended to put that constructor in this PR, but I decided it made more sense to pause for feedback here in a bite-sized chunk.
In particular, feedback on (1) whether there's any required data input missing and (2) the format of the data input (strict on what's required, vector per item) will be helpful.
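To make the error-handling pattern concrete, here is a rough sketch of a classed error that passes the calling environment, applied to the "no repeated dates within groups" check mentioned above. It assumes rlang is installed; the function names, argument names, and the error class RtGam_invalid_input are hypothetical and not taken from the package:

```r
library(rlang)

# Hypothetical internal validator. `call = caller_env()` points the error at
# the function that called this helper rather than at the helper itself.
check_no_repeated_dates <- function(reference_date, group, call = caller_env()) {
  keys <- data.frame(group = group, reference_date = reference_date)
  if (anyDuplicated(keys) > 0) {
    abort(
      "`reference_date` must not repeat within a group.",
      class = "RtGam_invalid_input", # hypothetical class name
      call = call
    )
  }
  invisible(NULL)
}

# Hypothetical dataset constructor that runs the check before building output.
validate_dataset <- function(cases, reference_date, group) {
  check_no_repeated_dates(reference_date, group)
  data.frame(cases = cases, reference_date = reference_date, group = group)
}

dates <- as.Date("2024-01-01") + c(0, 1, 1)

# The class lets callers (and tests) catch this specific failure.
err <- tryCatch(
  validate_dataset(c(1, 2, 3), dates, group = c("a", "a", "a")),
  RtGam_invalid_input = function(cnd) cnd
)

# The same dates are fine when they fall in different groups.
ok <- validate_dataset(c(1, 2, 3), dates, group = c("a", "a", "b"))
```

Because the error carries a class, tests can assert on the class instead of matching message text, and the reported call is the user-facing function rather than the internal helper.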
Closes #7