Closed by josherrickson 2 years ago
Regarding non-support of clusterIds= in ate() and ett() calls: Are you reporting that you tried to make this work and couldn't get it to go, or just that you didn't get as far as taking on that project? (If the former, are there commits or other residues of the unsuccessful experiments?)
(Thanks much for this helpful listing, Josh!)
Neither; clusterIds is fully supported inside ittestimate; it just needs to be either a) pushed down the call stack, or b) made into a helper function. It's very low-hanging fruit that I'll probably try to knock out if I get some free time this afternoon.
Good that ittestimate() supports it, Josh. I support wrapping up that line, as you propose.
At the same time, I'm not ready to abandon the original concept, with a clusterIDs= argument to each of the weighting functions that permits their being invoked within calls like
glm(<...>, data=mydata, weights=ate(mydesign, clusterIDs=<key(s) common to mydesign & mydata>))
I.e., I do also wish to serve use cases calling for design-determined weights for use in non-lm()-based modeling, and with the data at a different "level" than that of the design table. (E.g., it's a school-randomized study, so that design info is carried in a table of schools, and yet the analyst is modeling off of a student-year table.) In some of these cases ittestimate() will be appropriate to finish the analysis off, but in others the user may wish to pass the weights into lmer() or another setup customized for the handling of repeated measures. Here the ett(foo, clusterIDs=c("bar", "baz")) form that I had proposed serves the important role of allowing us to store the design-related info along with the fitted lmer model, hidden away in the (weights) column of its model frame. So for this purpose I'd like this form to stay on the agenda, for the time being.
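To make the school/student-year case concrete, here is a minimal base-R sketch of the idea, not the package's implementation. The tables, the wt column, and the expand_weights() helper are all hypothetical; the sketch assumes a single key column shared by the design table and the analysis data, and uses glm() only because it ships with R.

```r
# Hypothetical school-level design table: one row per school (the cluster).
schools <- data.frame(
  school_id = c("A", "B", "C"),
  z         = c(1, 0, 1),        # treatment assignment
  wt        = c(1.5, 3.0, 1.5)   # placeholder design-determined weights
)

# Hypothetical student-year analysis table: several rows per school.
students <- data.frame(
  school_id = c("A", "A", "B", "C", "C", "C"),
  year      = c(1, 2, 1, 1, 2, 3),
  y         = c(10, 12, 9, 14, 15, 13)
)

# Look up each analysis row's cluster in the design table via the shared key,
# returning one weight per row of the analysis data.
expand_weights <- function(design_tbl, data, key) {
  idx <- match(data[[key]], design_tbl[[key]])
  if (anyNA(idx)) stop("some rows of the data have no match in the design table")
  design_tbl$wt[idx]
}

students$wt <- expand_weights(schools, students, key = "school_id")

# The expanded weights travel with the fit, in the "(weights)" column
# of its model frame.
fit <- glm(y ~ year, data = students, weights = wt)
stored_wt <- model.frame(fit)[["(weights)"]]
```

The same lookup would apply to any modeling function that accepts a weights argument and keeps a model frame; handling several key columns at once is left out of this sketch.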
Sorry if it was unclear - I fully intend to support that; I just implemented clusterIds inside ittestimate first and haven't gotten around to porting it into ate and ett.
Ok, it's done - ate(des, clusterIds = list("oldname" = "newname")) is supported.
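For readers unfamiliar with the list("oldname" = "newname") form, the following base-R sketch shows one way such a mapping could be applied before matching keys. It is an illustration only: rename_keys() is a hypothetical helper, and the direction of the mapping (list names are column names in the design table, values are the corresponding names in the analysis data) is an assumption that the thread does not spell out.

```r
# Assumption: names(key_map) are column names in the design table and the
# values are the corresponding column names in the analysis data.
rename_keys <- function(design_tbl, key_map) {
  old <- names(key_map)
  new <- unlist(key_map, use.names = FALSE)
  missing_cols <- setdiff(old, names(design_tbl))
  if (length(missing_cols) > 0) {
    stop("not found in design table: ", paste(missing_cols, collapse = ", "))
  }
  # Rename only the mapped columns; leave every other column untouched.
  names(design_tbl)[match(old, names(design_tbl))] <- new
  design_tbl
}

design_tbl <- data.frame(oldname = c("A", "B"), wt = c(2, 4))
renamed <- rename_keys(design_tbl, list("oldname" = "newname"))
```

After the rename, the design table and the analysis data share a key name, so an ordinary match() or merge() on that key can line the weights up with the data.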
Terrific! (Sorry I misunderstood.)
(I'm checking off several cov_adj() concerns that aren't so much resolved as moved into issues #2, #3 or #4, all with the cov_adj() project.)
@josherrickson,
RCTDesign? I may have used things like that in the proposal -- this may well be the origin of your upper camel case function names -- but for the actual package I think I'd prefer rct_design.
forcing() should support multiple variables: Yes, it should. (As it does, thanks!)
Names are completely replaceable; as noted I mostly just took the cues from the document. I do think we should strive to introduce some consistency/rules for names since we're starting from scratch.
What about just design, and have a type = argument? It offers more flexibility if we're going to be adding more types of designs going forward. Otherwise see my response above.
Thanks J. While we're listing aspects of the naming scheme for possible revision, I should add that "design" seemed a good thing to write in the proposal, but isn't necessarily best for the software.
The DeclareDesign team sort of beat us to the punch.
"Assignment" might fit better, but it's long, easy to misspell and too similar to existing function names, such as base::assign()'s.
Something related to "Data"? DataDesign? DataStructure? DataOrganization? DataComposition?
... Since posting about this nomenclature issue, I've been thinking "rct_groups()", "rdd_groups()", "matched_groups()" and "obs_groups()".
Rationale:
When a user gives z ~ strata(foo) + forcing(bar) as first argument to any of these functions, the very first thing she's communicating to us is the composition of those groups. That's the one specification of groups that we expect which is mandatory, conceptually at least, strata and clustering each being present for some designs but not others.
I believe everything on this list is either addressed, outdated, or not relevant anymore.
Structural
- Design support c/rbind?
- Design, error if the same variable is used in multiple places
- forcing() supports multiple variables. Should it?
- Design replacers do not support changing number of variables (e.g. if you created the Design with cluster(a, b), you cannot replace with a single-variable cluster ID).
- @type, but RD requires adding/removing forcing.
- DirectAdjusted output is just lm output. Want that to emphasize treatment effect.
- DirectAdjusted validity checker should be more comprehensive to ensure users don't pass malformed lm into as.Directadjusted.
- cov_adj should take a type argument to handle response/link for predict.glm. Are there others for other models?
- cov_adj just has a tryCatch for predict; do any of the desired supported models require more bespoke handling?
- ittestimate needs more input sanitization.
- lm(..., weights = ate(des)) is supported but extremely fragile.
- ate and ett don't support clusterIds (only ittestimate does). Handle clusterIds in helper function.
- ittestimate should support a weights argument for users wanting to pass something like mywt*ate(des). Ignore "target" argument if weights is non-null.
- ate and ett need to support clusterIds first.
- weights in ittestimate, if user passes a WeightedDesign object, handle appropriately (move weight to weights, extract Design).

Stats
- Design objects, it's not implemented. (Aside from only RD supporting forcing.)
- ett and ate generate dummy weights.
- vcov.Directadjusted and confint.Directadjusted, these obviously need to be expanded and any additional post hoc functions implemented.
- cov_adj just runs predict.