Closed cgauvi closed 2 years ago
Looks right to me. Some differences because 2021 geographies on CensusMapper have been clipped to remove larger waterways.
(From GeoSuite)
Hmm, the map above isn't cartographic though? Maybe this mapview screenshot will make it clearer that the boundaries are different.
Ah, I see it now. Thanks. Yes, that's a problem, looks like the "island" got removed because it's too small. Part of the problem of trying to balance faster mapping and download times with accuracy. I will have to think about how to best handle this.
One option is to reduce the simplification for the CDs on CensusMapper, another is to separate out mapping geometries from cancensus geometries. CensusMapper retains high-resolution geographies, but right now they aren't getting accessed via cancensus.
Another way to handle it is to add an option to cancensus to ask for high-resolution geographies. That will significantly increase download times and load on the server, but it may be ok if it's an option that people can turn on for the times they want higher-level geographies like CDs at high spatial resolution/low simplification.
Thoughts?
I think the last option (an option for full-resolution geographies) is the best: it is the most flexible and meets all needs. I also suspect it might be easier to implement, and with the caching I think it makes sense to have the opportunity to download higher-resolution datasets. It also follows some of the standards in other related packages.
cf. the tigris::counties documentation:
cb: If cb is set to TRUE, download a generalized (1:500k) counties file. Defaults to FALSE (the most detailed TIGER file).
Ok, will add that to my to-do list. Leaving this issue open until I have a fix for this.
This will get fixed in the upcoming release of cancensus. It adds the ability to download high-resolution geographies, and I have also uploaded a slightly less simplified version of the geographies to CensusMapper as I think I went a little too far.
The following code shows what the new geographies look like:
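library(cancensus)
library(dplyr)
library(ggplot2)

# Fetch the same census division at both resolutions and plot them side by side
c("simplified", "high") |>
  lapply(function(resolution)
    get_census("CA21", regions = list(CD = 2466), geo_format = "sf",
               use_cache = FALSE, resolution = resolution) |>
      mutate(resolution = resolution)) |>
  bind_rows() |>
  ggplot() +
  geom_sf() +
  facet_wrap(~resolution) +
  coord_sf(datum = NA)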
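To put a number on how much two versions of a boundary differ, rather than judging it from the plot, one could compare vertex counts and areas. The following is only a sketch: `compare_geometries` is a hypothetical helper (not part of cancensus), and the square with a small "island" is a synthetic stand-in for the simplified and high-resolution results of `get_census()`.

```r
library(sf)

# Hypothetical helper (not part of cancensus): summarise how two versions
# of the same region differ in vertex count and area.
compare_geometries <- function(a, b) {
  sym_diff <- st_sym_difference(st_union(a), st_union(b))
  c(
    vertices_a    = nrow(st_coordinates(a)),
    vertices_b    = nrow(st_coordinates(b)),
    area_a        = sum(as.numeric(st_area(a))),
    area_b        = sum(as.numeric(st_area(b))),
    area_sym_diff = sum(as.numeric(st_area(sym_diff)))
  )
}

# Synthetic stand-ins: a 10 x 10 "mainland" and a 1 x 1 "island".
main_ring   <- rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))
island_ring <- rbind(c(12, 0), c(13, 0), c(13, 1), c(12, 1), c(12, 0))

# "Detailed" version keeps the island; "simplified" version has lost it.
detailed   <- st_sfc(st_multipolygon(list(list(main_ring), list(island_ring))))
simplified <- st_sfc(st_polygon(list(main_ring)))

diff_summary <- compare_geometries(detailed, simplified)
print(diff_summary)
```

With real data, the two arguments would be the `sf` objects returned for `resolution = "high"` and `resolution = "simplified"`; a non-zero `area_sym_diff` would flag cases like the missing island.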
The new version of the package also supports recalling census data: it emits a warning when recalled data has been cached or is being used, and it provides a convenience method to remove recalled locally cached data. I will recall the geographies mid next week after doing some more testing, and (hopefully) having the new version of the package up on CRAN.
This is very cool!
Would it be possible to calculate or estimate the size of the download at the time of the API call for different resolution levels? If someone pulls every DA in Canada at high resolution, we should warn them first. If this is harder to implement, then we can think about how to set a healthy default and steer users to safety in the documentation.
The default is still the simplified geographies at each level. At the DA level there is no (or very little) difference between simplified and high resolution. One thing I have done server side is slap an API point penalty on getting high-resolution geographies, so if someone requests high-resolution DAs for all of Canada the server will just send an error saying the user does not have enough API quota. In practice, only users with advanced API privileges will ever have the opportunity to download lots of high-resolution geographies, and those users probably know what they are doing. So I don't think this will be much of an issue in practice, although it might need better documentation.
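On the client side, a script that asks for high resolution could fall back to the simplified geographies when the server refuses the request. The sketch below is an assumption, not cancensus behaviour: `fetch_with_fallback` is a hypothetical helper, and it treats any error (including a quota error, whose exact message is not specified here) as a signal to try the next resolution.

```r
# Hypothetical helper (not part of cancensus): call `fetch(resolution)` for
# each resolution in turn and return the first result that does not error.
fetch_with_fallback <- function(fetch, resolutions = c("high", "simplified")) {
  for (res in resolutions) {
    result <- tryCatch(
      fetch(res),
      error = function(e) {
        warning("Resolution '", res, "' failed: ", conditionMessage(e),
                call. = FALSE)
        NULL
      }
    )
    if (!is.null(result)) return(result)
  }
  stop("All requested resolutions failed.", call. = FALSE)
}

# Stub standing in for the API call: pretends "high" exceeds the quota.
stub_fetch <- function(res) {
  if (res == "high") stop("not enough API quota")
  paste("geometries at", res, "resolution")
}

fallback_result <- suppressWarnings(fetch_with_fallback(stub_fetch))
print(fallback_result)

# With cancensus (needs an API key and network access), `fetch` could be:
# function(res) cancensus::get_census("CA21", regions = list(CD = 2466),
#                                     geo_format = "sf", resolution = res)
```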
Looks great! Thanks for taking the time to look into this.
Hi, thanks again for all the great work, cancensus is massively convenient. I've been having some issues with polygon boundaries recently. Here's one problem with the 2021 data:
This boundary is wrong and is different from the one I get on the Statistics Canada website.
I suspect some combination of polygon simplification and water removal code is creating the issue: Montreal seems to get cut off at the Lachine Canal.
From my renv.lock