Closed josherrickson closed 2 years ago
I agree that 1. is the preferable behavior, and am also open to supplanting the logic of .fullmatch.with.recovery(). But first I'd like to understand more specifically how LEMON's implicit recovery is working.
It may in effect be implementing a version of #200, which would be an upgrade over the current .fullmatch.with.recovery() logic. I.e.,
Does this sound like what's going on? If so, then perhaps it's what we should be doing even in cases where we're going to use the RELAX-IV min-cost flow solver at step 4, as an alternative to and improvement over the current recovery logic.
If something like this is what's going on, I'd like to understand more about the details. Ordinarily, reductions of the flow parameter in (3) above would correspond to reducing omit.fraction, but in extreme cases they might also necessitate reduction of the number of treatment group subjects to be matched. Perhaps in those cases the problem should just be declared infeasible. I'd like to understand whether those extreme cases can arise and how they're handled.
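To picture when such extreme cases could arise, here is a minimal counting sketch in Python. It is not optmatch or LEMON code. It assumes a simplified setting in which every treated unit must be matched, every treated-control pairing is allowed, and min_controls and max_controls are whole numbers of at least 1, so each matched set has exactly one treated unit. The function name and its arguments are hypothetical.

```python
def matching_status(n_treated, n_controls, min_controls, max_controls, n_omitted=0):
    """Classify a simplified matching problem by counting alone.

    Assumptions (illustrative only): all pairings are allowed, every
    treated unit is matched, and each matched set holds one treated unit
    with between min_controls and max_controls controls.
    n_omitted plays the role of omit.fraction * n_controls.
    """
    kept = n_controls - n_omitted          # controls that must be placed
    fewest_needed = n_treated * min_controls
    most_allowed = n_treated * max_controls

    if kept < fewest_needed:
        # Omitting controls cannot help here; only keeping more controls
        # or matching fewer treated units would.
        return "too_few_controls"
    if kept > most_allowed:
        # Omitting more controls can repair this case.
        return "too_many_controls"
    return "feasible"


print(matching_status(5, 22, 1, 3))               # 22 controls, room for 15
print(matching_status(5, 22, 1, 3, n_omitted=7))  # 15 controls, room for 15
print(matching_status(5, 3, 1, 3))                # 3 controls, 5 needed
```

Under these assumptions, the "too_many_controls" case is the one that omitting controls can fix, while "too_few_controls" with nothing omitted is the kind of case that could only be resolved by matching fewer treated units or by declaring the problem infeasible.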
Ben and I spoke offline, and I did more research. There are four algorithms implemented in LEMON's mincostflow: Cycle Cancelling, Capacity Scaling, Cost Scaling, and Network Simplex. At the point of running the above, the default was Network Simplex as LEMON describes it as the fastest. However, as I found and Ben discussed, it does some sort of manipulation when presented with an infeasible problem.
On the other hand, the Cycle Cancelling algorithm produces results that are most similar to RELAX-IV and, crucially for the issue at hand, simply errors when presented with an infeasible problem.
Therefore we are changing the default to Cycle Cancelling. (For the record, when calling fullmatch, the solver argument can take in either "RELAX-IV" or "LEMON", or the LEMON function, e.g. LEMON("CostScaling"), so all four algorithms are available.)
When we have recovery, both LEMON and RELAX-IV produce the same match. However, if we disable recovery, things are different.
On the plus side, LEMON appears to have "recovery" baked in automatically. However, because we do some preprocessing, the matches differ quite a lot.
The preprocessing for recovery first re-fits without max.controls, then figures out how many controls to drop to ensure all matched sets are within max.controls.
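The arithmetic behind that preprocessing step can be sketched as follows. This is an illustrative Python reading of the description above, not the package's actual code: it assumes the number of controls to drop is the total excess over max.controls across matched sets, and that the resulting omit.fraction is that excess divided by the number of controls. The function name and the set sizes are hypothetical.

```python
from fractions import Fraction


def recovery_omit_fraction(controls_per_set, max_controls):
    """Return (excess, omit_fraction) implied by an unrestricted match.

    controls_per_set: number of controls in each matched set of the
    match that was fit without max.controls (hypothetical input).
    """
    n_controls = sum(controls_per_set)
    if n_controls == 0:
        raise ValueError("the unrestricted match contains no controls")
    # Controls beyond max_controls in any set would have to be dropped.
    excess = sum(max(0, c - max_controls) for c in controls_per_set)
    return excess, Fraction(excess, n_controls)


# Toy example: four matched sets holding 10 controls in total.
excess, omit_fraction = recovery_omit_fraction([1, 1, 2, 6], max_controls=3)
print(excess, omit_fraction)
```

In the toy example only the last set exceeds the limit, by 3 controls, so this reading implies dropping 3 of the 10 controls before re-fitting with max.controls in place.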
The unrestricted match is: (Note that there are 22 controls.) So what's happening here is that:
omit.fraction = 9/22.
omit.fraction = 2/22
What's the best move forward here:
fullmatch_try_recovery option, but it feels the most true to the existing codebase. We'll need to expand the documentation to explain that the fullmatch_try_recovery option functions slightly differently by solver.
I'm leaning towards 1. LEMON's "recovery" appears to me to be preferable to our hacky version. I don't think it's a big deal that the two solvers produce different results in some scenarios; they're different implementations overall.