Closed: kuriwaki closed this issue 2 years ago
1. […] `counties`. If you do things like merging small counties that you really don't want to be split, this will lower the total splits in general, though not guaranteed.
2. `splits` is not currently an option in `redist_smc`, as the arguments are not currently passed on to the Rcpp that actually runs SMC. If you set `constraints(splits = list(strength = 100))`, it won't break anything, but it also won't be accessed. `splits` is a soft Gibbs constraint, rather than a hard algorithmic constraint, so they are very different.
3. […] `splits`, as they do have different meanings. We could call it the more generalized term, like `political_subdivision`, but I'm not sure if that adds more value than it costs. Happy to hear other ideas for names, if this is something important.

I see, thank you.
If there are 4 districts and only 2 counties (city vs. non-city), then the constraint would be to "only generate maps which split up to 3 counties", which is never binding here (there are only 2 counties to split). And yet, we see differences in simulations. Is the counties argument doing something else?
Got it. I think noting your answer here in the manual would be clarifying. Or throw a warning if a user sets splits in SMC. We were about to make this mistake.
Re: 1, what's going on is that the spanning trees are drawn first within each county and are then joined together with a meta-spanning tree. As a result, the partitions tend to follow county boundaries. It also guarantees an upper bound on the number of county splits. But as you note, even with more districts than counties, the way the trees are drawn still makes the argument useful.
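The two-level tree drawing described above can be sketched in a few lines. This is a minimal illustration and not redist's implementation: the function names, the grid example, and the "city"/"non-city" labels are all hypothetical, each county is assumed to be internally connected, and shuffled-edge Kruskal stands in for whatever tree sampler the package actually uses (it does not sample trees uniformly).

```python
import random


def random_spanning_tree(nodes, edges, rng):
    """Random (not uniform) spanning tree: Kruskal over shuffled edges."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    shuffled = list(edges)
    rng.shuffle(shuffled)
    for u, v in shuffled:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree


def county_spanning_tree(edges, county, rng):
    """Draw a tree inside each county, then join counties with a meta-tree."""
    nodes = sorted(county)
    labels = sorted(set(county.values()))
    tree = []
    # Step 1: one spanning tree per county, using only within-county edges.
    for c in labels:
        members = [v for v in nodes if county[v] == c]
        inside = [(u, v) for u, v in edges if county[u] == c and county[v] == c]
        tree += random_spanning_tree(members, inside, rng)
    # Step 2: a spanning tree on the county-level graph. Each county pair is
    # represented by one randomly chosen precinct edge that crosses it.
    crossing = [(u, v) for u, v in edges if county[u] != county[v]]
    rng.shuffle(crossing)
    representative = {}
    for u, v in crossing:
        representative.setdefault(tuple(sorted((county[u], county[v]))), (u, v))
    meta_tree = random_spanning_tree(labels, list(representative), rng)
    tree += [representative[tuple(sorted(e))] for e in meta_tree]
    return tree


# Hypothetical example: a 4x4 grid of precincts split into two counties.
rng = random.Random(0)
nodes = [(r, c) for r in range(4) for c in range(4)]
county = {(r, c): "city" if c < 2 else "non-city" for r, c in nodes}
edges = [((r, c), (r, c + 1)) for r in range(4) for c in range(3)]
edges += [((r, c), (r + 1, c)) for r in range(3) for c in range(4)]

tree = county_spanning_tree(edges, county, rng)
crossing_in_tree = [e for e in tree if county[e[0]] != county[e[1]]]
print(len(tree), len(crossing_in_tree))  # 15 1
```

In this sketch, every county's precincts form a connected subtree, so cutting `ndists - 1` tree edges can split at most `ndists - 1` counties: a county is only split when a cut falls inside it. The same structure also shapes the districts when there are fewer counties than `ndists - 1`, because only one tree edge links each pair of joined counties.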
Comment 2 is closely linked to #96. The current constraint checkers are designed largely to avoid breaking errors in the Rcpp code, not to detect user errors. I think we can make #96 a 3.2 priority to ensure that people aren't doing bad things without knowing it (or just aren't doing things).
On 1, yeah, the idea of drawing the spanning trees in the counties (as Cory mentions) is really important. You get much, much more realistic plans this way. The plans drawn without `counties` are fairly far off from what states really enact, whereas we can sample pretty similar plans using `counties`. Even when the maximum number of splits isn't binding, the argument is still useful district by district.
I think noting this mechanism in the docs could be helpful.
In the `@param counties` in `redist_smc` or the details, I'd specify something along the lines of "this will draw spanning trees within each of the counties specified. There is no strength parameter associated with this option. Even if there are fewer counties than `ndists - 1`, the spanning trees will change the results of the simulations."
The `@param counties` in `redist_mergesplit` can specify that the strength of that "soft" Gibbs constraint is specified by the `splits` constraint.
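To make the "soft" versus "hard" distinction concrete, here is a small sketch of how a strength parameter could reweight plans by their number of split counties. The penalty form `exp(-strength * splits)` is an assumption made for illustration, and the function names and toy data are hypothetical. None of this is taken from the redist source.

```python
import math


def count_split_counties(plan, county):
    """Number of counties whose precincts fall in more than one district."""
    districts_by_county = {}
    for precinct, district in plan.items():
        districts_by_county.setdefault(county[precinct], set()).add(district)
    return sum(1 for d in districts_by_county.values() if len(d) > 1)


def gibbs_weight(plan, county, strength):
    # Assumed penalty form: each split county multiplies the weight by
    # exp(-strength). Plans with splits are down-weighted, never forbidden.
    return math.exp(-strength * count_split_counties(plan, county))


# Hypothetical toy data: four precincts in two counties.
county = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
intact = {"p1": 1, "p2": 1, "p3": 2, "p4": 2}       # no county is split
split_plan = {"p1": 1, "p2": 1, "p3": 1, "p4": 2}   # county B is split

print(count_split_counties(intact, county))      # 0
print(count_split_counties(split_plan, county))  # 1
print(gibbs_weight(split_plan, county, 1.0) / gibbs_weight(intact, county, 1.0))
```

Under this assumed form, a larger `strength` makes split plans rarer but never impossible, which is why it behaves differently from the structural guarantee that `counties` gives.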
An error or warning for mismatching constraints would be great.
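A check of this kind could look roughly like the sketch below. It is only an illustration of the idea: the lookup table and function name are hypothetical, and the single entry reflects what this thread says about `splits` being ignored by SMC, not an audit of the package.

```python
import warnings

# Hypothetical table of constraints that a sampler accepts but never reads.
# The one entry reflects this thread: `splits` is not accessed by SMC.
KNOWN_IGNORED = {"smc": {"splits"}}


def check_constraints(algorithm, constraints):
    """Warn about constraints the chosen sampler would silently ignore."""
    ignored = sorted(set(constraints) & KNOWN_IGNORED.get(algorithm, set()))
    for name in ignored:
        warnings.warn(
            f"constraint '{name}' is not used by '{algorithm}' and will be ignored",
            UserWarning,
        )
    return ignored


print(check_constraints("smc", {"splits": {"strength": 100}}))         # ['splits']
print(check_constraints("mergesplit", {"splits": {"strength": 100}}))  # []
```

Returning the list of ignored names as well as warning makes the check easy to promote to an error later, if that turns out to be the preferred behavior.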
Some of the documentation for the `counties` argument is a bit hard to navigate.

1. […] `only generate maps which split up to ndists-1 counties`. Is there a parameter to change that `ndists-1` threshold?
2. […] `splits` constraint? That constraint is not mentioned in `redist_smc` but it is in `redist_mergesplit`. What happens if I set `splits = list(strength = 100)` in `redist_smc` also with `counties`?
3. `counties` naming is too US-context-specific -- we use it in Japan and city boundaries as well. Could there be an argument (e.g. `splits`) that simply duplicates `counties` so users can use either?

cc @tylersimko