Open mitchelloharawild opened 2 years ago
This is a very nice example, thanks for sharing.
I will apply this workflow when modifying the current `geogrid::calculate_grid` function and will inform you of any updates.
The draft function has been pushed to GitHub as `calculate_grid.R`.
The function is not complete, and debugging is required.
Regarding the issue of obtaining the best number of hexagons automatically, I think letting users define an `id` variable could be one solution, since the number of hexagons would then be set by the user.
Meanwhile, I will try to implement your suggestion: determining the number of hexagons by maximizing the overlapping area between regions.
Well, the `id` used here is simply the row number from the `<sf>` dataset. This is because each row corresponds to a geometry (say a statistical area or electorate). So I don't think the user needs to define the `id` variable; we can generate it, and we only need it as an intermediate linking variable for computational purposes.
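A minimal sketch of the point above, in base R. The `regions` and `allocation` tables are made-up stand-ins (a plain data frame instead of an `<sf>` object), and the hexagon-to-region assignment is purely illustrative:

```r
# Stand-in for the <sf> dataset: one row per geometry.
regions <- data.frame(name = c("A", "B", "C"))

# Generate the id internally as the row number; the user never supplies it.
regions$id <- seq_len(nrow(regions))

# Illustrative allocation result: which region id each hexagon was given.
allocation <- data.frame(hex = 1:3, id = c(2, 3, 1))

# Use id only to link hexagons back to their rows, then drop it again.
linked <- merge(allocation, regions, by = "id")
linked <- linked[order(linked$hex), ]
linked$id <- NULL
linked
```

Because `id` is created and removed inside the computation, it never needs to appear in the function's interface.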
It should however be possible to provide a column of weights, indicating that each region may warrant more than one hexagon (say to weight by population).
Sure, I will take the `id` off and replace it with `weight` to amplify the importance of some regions if needed.
However, I wonder how `weight` will emphasize the effect of a particular area.
Should I add it when selecting the area with the maximized intersection area (via `top_n`)?
Or should I first calculate `best_hex` (the best number of hexagons for the task), then use `weight` to update the number of hexagons (i.e. update `best_hex`)?
Does `weight` need to be scaled before it is passed in? And if `weight` is sometimes a categorical variable (specifically an ordinal one such as "good > moderate > bad"), do we convert it for the users or give them an error/warning?
Weight should just change the number of hexagons (rather than 1 per row, use the sum of weight). This is because the same weight variable will be used when distorting the map with the cartogram, so the exact association between each hexagon and their geometry doesn't matter until later (the allocation step).
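A minimal sketch of this rule in base R. The helper name `count_hexagons`, the rounding of weights, and the choice to stop on non-numeric weights (rather than convert them) are all assumptions for illustration, not decisions made in this thread:

```r
# Hypothetical helper: the total number of hexagons is the sum of the weights.
# With no weight column given, every row (region) gets exactly one hexagon.
count_hexagons <- function(data, weight = NULL) {
  if (is.null(weight)) {
    return(nrow(data))
  }
  w <- data[[weight]]
  # Assumption: reject non-numeric weights (e.g. ordered factors) instead of
  # silently converting them.
  if (!is.numeric(w) || anyNA(w) || any(w < 0)) {
    stop("`weight` must be a non-negative numeric column without missing values.")
  }
  # Assumption: round so each region maps to a whole number of hexagons.
  sum(round(w))
}

# Made-up example data.
regions <- data.frame(
  name = c("A", "B", "C"),
  pop_weight = c(1, 3, 2)
)

count_hexagons(regions)                # 3: one hexagon per row
count_hexagons(regions, "pop_weight")  # 6: sum of the weights
```

Since `data[[weight]]` and `nrow()` behave the same way on an `<sf>` object, the same sketch should carry over to the real dataset.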
Answering your questions: since the distortion and the number of hexagons are both controlled by `weight`, the optimisation of hexagons can be done as usual (just on different hexagons due to the effect of `weight`).
Hi @ZIYAOWANG123, how are you going with this?
Hi Mitchell, I am still working on the `cellsize` function.
I think I can provide some updates on the function this Wednesday or Thursday, and maybe arrange a time to discuss it with you.
Meanwhile, I will put the work in progress onto the package GitHub as well.
Just a note that we have a meeting scheduled this Thu at 4pm and, @ZIYAOWANG123, you have a presentation to give at the Brown Bag Session on Tue Oct 4.
The `cellsize` function is to calculate the proper cell size used to create the hexagon grid. It contains two parts:
- Obtaining the cell size:
  - Calculate the total AU area (perhaps) by `st_boundary`;
  - Calculate the total AU land area by `sum(st_area(data))`;
  - Obtain the ratio = Total AU land area / Total AU area;
  - Use the known number of hexagons needed for the land area to obtain the number of hexagons for the total AU area;
  - Based on the number of hexagons for the total AU area, determine the proper cell size used to generate the hexagon grid.
- Using the cell size for a good hexagon grid result:
  - A cell size will determine the number of hexagons for a given area (i.e. Total AU area)
  - Need to know how cell size affects the number of hexagons for the area, to control the number of hexagons to match the ratio from the last part, `step 3`.

Please correct or add to my comment!
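The steps in the list above can be sketched in base R as follows. This is only a rough sketch under stated assumptions: the helper name `cellsize_from_ratio` is hypothetical, "total AU area" is taken to mean the area of the region the grid is laid over (for example the bounding box), and `cellsize` is taken to be the distance between opposite edges of a hexagon, which is how the `sf` documentation for `st_make_grid()` describes it. The input numbers are made up.

```r
# Hypothetical helper following the listed steps.
# land_area:  total area of the regions, e.g. sum(st_area(data))
# total_area: area the grid is laid over (assumed: the bounding box area)
# n_land:     number of hexagons wanted on land
cellsize_from_ratio <- function(land_area, total_area, n_land) {
  ratio <- land_area / total_area      # step 3: land share of the total area
  n_total <- n_land / ratio            # step 4: hexagons over the total area
  hex_area <- total_area / n_total     # area each hexagon must have
  # A regular hexagon whose opposite edges are d apart has area sqrt(3)/2 * d^2,
  # so solve that for d.
  sqrt(2 * hex_area / sqrt(3))
}

# Made-up areas in arbitrary squared units.
d <- cellsize_from_ratio(land_area = 700, total_area = 1400, n_land = 151)
d
```

Note that `total_area / n_total` simplifies to `land_area / n_land`, so under these assumptions the ratio cancels and the estimate depends only on the land area and the wanted count. It is also only a starting value: hexagons that straddle the coastline make the real count differ, so some adjustment of the cell size is still needed afterwards.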
While I was setting up the `cellsize` function, I found that `st_boundary` outputs the boundaries of the `sf` object, not the area we wanted.
Here are two main questions:
- How to obtain the whole area of AU?
- After obtaining the number of hexagons for the whole area (i.e. using land/no. of hex_land = tot_area/no. of hex_tot), how to use this number of hexagons to alter the cell size in `st_make_grid`? (We want the cell size which can produce the right number of hexagons.)
Additionally, I have checked the source code of the function `st_make_grid` to see how `cellsize` actually works in it. It shows a "strange" way of using it (`c(diff(st_bbox(x)[c(1, 3)]), diff(st_bbox(x)[c(2, 4)]))/n`, where `x` is the data set and `n` is an indicator with a default of `n = c(10, 10)`). But I couldn't find out how to use the known/wanted number of hexagons (i.e. the output of `st_make_grid`) to adjust the `cellsize` parameter.
Any suggestions and help will be appreciated! @mitchelloharawild @emitanaka
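For reference, the quoted expression can be evaluated by hand in base R. It takes the width and height of the bounding box and divides them by `n`, which gives the cell size that would split the bounding box into `n[1]` by `n[2]` cells. The bounding box values below are made up for illustration, and a plain named vector stands in for the result of `st_bbox()`:

```r
# Illustrative bounding box in the order st_bbox() uses: xmin, ymin, xmax, ymax.
bbox <- c(xmin = 113, ymin = -44, xmax = 154, ymax = -10)
n <- c(10, 10)

# Same arithmetic as the quoted expression: (width, height) / n.
default_cellsize <- unname(c(diff(bbox[c(1, 3)]), diff(bbox[c(2, 4)])) / n)
default_cellsize  # 4.1 3.4
```

So `n` only matters when `cellsize` is not supplied. Passing `cellsize` directly is probably the simpler handle for controlling the number of hexagons.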
INPUT: SF geometry, OUTPUT: SF hexagon tiled geometry
Here's some code I've just written to experiment with tiling polygons with a specific number of hexagons. I think this will help you with writing some code/algorithm to produce better hexagon maps with a specific number of hexagons.
Unlike `geogrid::calculate_grid()`, this method deterministically chooses hexagons from a tiling based on their overlapping area. This should give a better and more consistent tiling instead of randomly sampling them.
Map of Australian states
Tile the map with hexagons
Not enough hexagons, how do we set the appropriate grid dimensions?
Find the number of hexagons that overlap Australia, and adjust the cell size until that number reaches the desired number of hexagons.
126 hexagons cover Australia with cellsize = 3
We need more, so decrease cellsize slightly
153 hexagons cover Australia with cellsize = 2.7
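The manual adjustment above could be automated with a simple search. The sketch below is a hypothetical illustration in base R, not the code used in this thread: `find_cellsize` and `toy_count` are invented names, and `toy_count` is a stand-in that assumes the count scales with map area divided by hexagon area. With `sf`, the counter would instead count the cells of a hexagonal `st_make_grid()` tiling that intersect the map.

```r
# Hypothetical search: shrink the cell size until at least `target` hexagons
# overlap the map. `count_fn` is any function mapping a cell size to a count.
find_cellsize <- function(count_fn, target, start = 3, step = 0.9, max_iter = 50) {
  cellsize <- start
  for (i in seq_len(max_iter)) {
    if (count_fn(cellsize) >= target) {
      return(cellsize)
    }
    cellsize <- cellsize * step  # smaller cells give more hexagons
  }
  stop("No cell size found within max_iter iterations.")
}

# Stand-in counter for illustration only (the area value is made up).
toy_count <- function(cellsize, area = 980) {
  floor(area / (sqrt(3) / 2 * cellsize^2))
}

best <- find_cellsize(toy_count, target = 151)
best
```

The search stops at the first cell size that gives at least the target count, so it usually overshoots slightly. The surplus hexagons are then dropped in the selection step.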
Now pick the best 151 hexagons based on maximising their overlapping area with Australia
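A minimal base R sketch of this selection step. The helper name `pick_best_hexagons` and the overlap values are made up. With `sf`, the overlap areas would presumably come from something like `st_area()` applied to the intersection of each hexagon with the map:

```r
# Hypothetical helper: keep the n hexagons with the largest overlap area and
# return their indices in the original grid order.
pick_best_hexagons <- function(overlap_area, n) {
  ranked <- order(overlap_area, decreasing = TRUE)
  sort(head(ranked, n))
}

# Illustrative overlap areas for six candidate hexagons (made-up numbers).
overlap <- c(0.2, 7.8, 5.1, 0.0, 7.8, 3.3)

keep <- pick_best_hexagons(overlap, n = 4)
keep  # 2 3 5 6
```

Ranking by overlap and taking the top `n` is what makes the choice deterministic: the same map and grid always give the same hexagons, unlike random sampling.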
A better result is probably achieved with a smaller grid:
Next steps:
Created on 2022-09-08 by the reprex package (v2.0.1)