Closed hafen closed 6 years ago
Hi @hafen
Thanks for submitting such a fantastic PR. It is thorough and very clear. Please see below for responses to the questions.
calculate_cell_size()
should be renamed and the recommendation of calculate_grid()
should be adopted. The current name is a bit of a hangover from what's happening underneath but this is not relevant to the user. Questions raised in the body of the PR
calculate_cell_size()
made learningrate have default of 0.03. Is this a good default? For now yes. I need to investigate the impact of this on different sized grids but recommend 0.03 for ~ 4 - 80 geographic units (e.g. states) (potentially lower for greater numbers of units). A few questions in response if i may:
This all sounds great. I just updated my fork to change calculate_cell_size()
to calculate_grid()
throughout, clarified about the use of the Hungarian algorithm, and added myself as a contributor. I also un-exported get_shape_details()
and removed that as an argument to calculate_cell_size()
(it's now just called inside that function, saving the user a step).
Given the function name change, these changes are breaking from prior usage and examples, so I also kept the old functions calculate_cell_size()
and get_shape_details ()
which now just display a hard deprecation error and point users to the new functions. Hopefully that is sufficient to get existing users oriented to the changes.
The only outstanding thing is the package name. If you change it, I'd suggest renaming the github repo as well (you can do this without losing your forks / stars, and the old link will redirect to the new one). This is a major change since people already know it by this name, and because of that I was hesitant to suggest it. I'll leave it entirely up to you. The reason I bring it up now is it's easier to change now than when it's on CRAN and more people are using it.
This looks a bit lengthy, but in all not much of substance was changed in the package. There are no breaking changes - previous code should still run just fine. Please let me know what you think about the questions raised.
Overview of changes:
devtools::check()
now runs with no issues, due to the updates below.calculate_cell_size()
is now of class "geogrid".calculate_cell_size()
madelearning_rate
have default of 0.03. Is this a good default? If not, what would be a good default?calculate_cell_size()
madeseed
default toNULL
, in which case seed isn't explicitly set.calculate_cell_size()
, madeshape_details
default toNULL
, in which case it will be automatically computed by callingget_shape_details()
on the providedshape
argument.get_shape_details()
from the examples. Since it only seems to be used inside ofcalculate_cell_size()
and since it appears to be very fast to compute, it seems fine to have the default behavior to be thatget_shape_details()
is not needed to be called directly by the user but only as-needed insidecalculate_cell_size()
. If there is never need for the result ofget_shape_details()
outside of being used insidecalculate_cell_size()
, I'd even argue that we go a step further and not exportget_shape_details()
and remove the option to supplyshape_details
as an argument tocalculate_cell_size()
.verbose
argument tocalculate_cell_size()
to allow user to specify whether messages should be printed while the algorithm iterates.print()
tomessage()
incalculate_cell_size()
.Questions:
calculate_cell_size()
have a different name? likecalculate_grid()
? The namecalculate_cell_size()
makes one think that you are getting a cell size back instead of a grid.geogrid
or if geo is too specific since it works for any polygons, maybe something else? This is only a suggestion.assign_polygons()
.