Closed nickeubank closed 1 year ago
@marlhakizi @nansuwang
Hi @nickeubank, @haoningjiang is working on this and raised an interesting point: a PE of 0 for the first iteration (which we'll move to 0-indexing and call the 0th iteration, by the way) makes sense when there are matched units. But when there are no perfectly matched units in the dataset, a null PE is also an option. Please let us know if you have a preference! If we don't hear from you, we'll go with 0 for the first iteration in all cases by default for now.
If no units are matched, nan / None makes sense to me.
As of now, an iteration refers to one round of PE computation and matching (independent of whether matches are actually made on the relevant covariate set). There will be as many entries in `.pe_each_iter` and `.bf_each_iter` as there are iterations.
The PE is the error associated with using a covariate set to predict the outcome and is thus always defined, regardless of whether matches are made. As an example, setting `early_stop_iterations = 0` and `want_pe = True` means that one round (the 0th) of matching will be completed (exact matching) and the `.pe_each_iter` attribute will be a list of length 1, containing the PE associated with using all covariates to estimate the outcome.
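To make the convention concrete, here is a minimal sketch in plain Python. It does not call the library: `pe_per_iteration`, `toy_pe`, and the covariate names are hypothetical, the PE values are placeholders, and the general rule "iterations 0 through `early_stop_iterations` inclusive" is an assumption extrapolated from the single `early_stop_iterations = 0` example above.

```python
from typing import Callable, List, Sequence


def pe_per_iteration(
    covariate_sets: Sequence[Sequence[str]],
    predictive_error: Callable[[Sequence[str]], float],
    early_stop_iterations: int,
) -> List[float]:
    """Record one PE per iteration, 0-indexed.

    Iteration 0 uses the first (full) covariate set. Iterations run from 0
    through early_stop_iterations inclusive (an assumption, see lead-in).
    """
    pe_each_iter: List[float] = []
    for iteration, covs in enumerate(covariate_sets):
        if iteration > early_stop_iterations:
            break
        # The PE depends only on the covariate set, so it is recorded
        # whether or not any units were matched in this iteration.
        pe_each_iter.append(predictive_error(covs))
    return pe_each_iter


def toy_pe(covs: Sequence[str]) -> float:
    """Placeholder error: grows as covariates are dropped. Not a real PE."""
    return 1.0 / len(covs)


# Hypothetical covariate sets, one per iteration (one covariate dropped each time).
sets = [["a", "b", "c"], ["a", "b"], ["a"]]

only_exact = pe_per_iteration(sets, toy_pe, early_stop_iterations=0)
three_rounds = pe_per_iteration(sets, toy_pe, early_stop_iterations=2)
```

With this layout, `three_rounds[i]` is the PE for iteration `i`, and `only_exact` has exactly one entry for the exact-matching round.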
Also running into some indexing confusion with `model.pe_each_iter` among students: since iteration 1 has a PE of 0 (all exact matches, right?), it doesn't get included in `pe_each_iter`, which means that if you index into it, you're off by 1 (or 2, given that DAME-FLAME counting seems to start at 1, not 0). We probably need to adopt a consistent approach to these.
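The off-by-one can be illustrated with a small sketch in plain Python. The PE values and variable names below are made up for illustration and are not taken from the library; the point is only how omitting the exact-matching entry shifts every later index.

```python
# Hypothetical PE values keyed by 0-indexed iteration (made-up numbers).
pe_by_iteration = {0: 0.0, 1: 0.4, 2: 0.9}

# Convention A: the exact-matching iteration is left out of the list,
# as described in the comment above.
pe_skipping_first = [pe for it, pe in sorted(pe_by_iteration.items()) if it > 0]

# Convention B: one entry per iteration, 0-indexed.
pe_all_iters = [pe for _, pe in sorted(pe_by_iteration.items())]

iteration = 1
shifted = pe_skipping_first[iteration]  # silently returns the PE of iteration 2
aligned = pe_all_iters[iteration]       # returns the PE of iteration 1
```

Under convention A the lookup does not fail; it returns the value for a different iteration, which is why the mismatch is easy to miss.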