Closed: flashton2003 closed this issue 6 years ago
@sjewo It looks like the issue is caused by zeros in the dataset. It works well when all the values are > 0:
afr_tb$TB_cases <- afr_tb$TB_cases + 1
afr_tb_cont <- cartogram_cont(afr_tb, "TB_cases", itermax = 10)
tm_shape(afr_tb_cont) + tm_polygons("TB_cases", style = "cont") + tm_layout(frame = FALSE)
Thank you!
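Before applying the workaround above, it can help to check how many weights are zero. This is a minimal sketch in base R, assuming a made-up vector tb_cases as a stand-in for afr_tb$TB_cases; the values are invented for illustration and are not from the real dataset.

```r
# Hypothetical stand-in for afr_tb$TB_cases; the values are made up.
tb_cases <- c(0, 0, 0, 1200, 35000, 438000)

# Count the non-positive weights and their share of the data.
n_nonpos     <- sum(tb_cases <= 0)
share_nonpos <- mean(tb_cases <= 0)

# The workaround from the comment above: shift every value by one
# so that no weight is exactly zero.
shifted <- tb_cases + 1

print(n_nonpos)
print(share_nonpos)
print(range(shifted))
```

If the share of zeros is larger than the quantile used for adjusting small values, that adjustment alone may not lift them above zero.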
It is difficult to calculate the distortion for very small or very large values. The default strategy is to raise the lowest values and shrink the largest. The threshold parameter defines a quantile; by default, all values below the 5th percentile are adjusted.
Your data has a lot of zeros, so the 5th percentile is zero too. Just raise the threshold to get a better adjustment:
afr_tb_cont <- cartogram_cont(afr_tb, "TB_cases", itermax = 10, threshold=0.1)
I'll add a warning in the next release, to print a message if the adjusted values are still zero.
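The effect of the threshold described above can be illustrated with base R alone. This is only a sketch under assumptions: the data are made up, adjust_weights is a hypothetical helper (not a function of the cartogram package), and raising small values up to the quantile is one plausible reading of "adjusted", not necessarily what the package does internally.

```r
# Made-up weights: 2 zeros followed by 19 positive values (21 in total).
cases <- c(0, 0, seq(100, by = 100, length.out = 19))

# Hypothetical helper, not part of the cartogram package:
# raise every value below the chosen quantile up to that quantile,
# and warn if the quantile itself is still zero.
adjust_weights <- function(x, threshold = 0.05) {
  cutoff <- unname(quantile(x, probs = threshold, na.rm = TRUE))
  if (cutoff <= 0) {
    warning("cutoff at threshold ", threshold,
            " is still zero; consider raising the threshold")
  }
  pmax(x, cutoff)
}

# With this many zeros the 5th percentile falls among the zeros,
# while the 10th percentile reaches the first positive value.
q05 <- unname(quantile(cases, probs = 0.05))
q10 <- unname(quantile(cases, probs = 0.10))

adj_default <- suppressWarnings(adjust_weights(cases))  # zeros stay zero
adj_raised  <- adjust_weights(cases, threshold = 0.1)   # zeros are lifted

print(c(q05 = q05, q10 = q10))
print(min(adj_default))
print(min(adj_raised))
```

In this toy data, the default threshold leaves the zeros untouched, while a threshold of 0.1 lifts them to the smallest positive value.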
This is the first time in my experience that countries having no TB was a bad thing :-)
Thanks for the explanation.
I have the same issue. Playing with your README example, except using the Americas instead of Africa:
library(cartogram)
library(tmap)
library(maptools)
data(wrld_simpl)
table(wrld_simpl$REGION)
x <- wrld_simpl[wrld_simpl$REGION == 19 & wrld_simpl$POP2005 > 0, ]
x <- spTransform(x, CRS("+init=epsg:3395"))
x_cont <- cartogram_cont(x, "POP2005", itermax = 5)
tm_shape(x_cont) + tm_polygons("POP2005", style = "jenks") +
tm_layout(frame = FALSE)
Result:
Could it be due to the bounding box?
Your example is a tough problem for the algorithm: Canada and Greenland have the largest areas and a rather small population. The great number of fjords and islands aren't helpful either...
After 100 iterations the scaling looks a little bit better:
Maybe you could try another distortion algorithm, like this ArcGIS plugin: https://www.arcgis.com/home/item.html?id=d348614c97264ae19b0311019a5f2276
@sjewo Thanks for the instructive explanation, and suggestion to increase the iterations.
Your algorithm works well; I'll just be more patient next time and work with a larger number of iterations. It's also helpful that you offer two stopping rules (error size and maximum number of iterations).
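The two stopping rules mentioned above (an error size and a maximum number of iterations) can be illustrated with a toy loop. This is a sketch in base R under stated assumptions, not the package's code: areas are plain numbers rather than polygons, the size-error definition and the names target, area and max_err are hypothetical, and the halving step is only a stand-in for a real distortion step.

```r
# Toy illustration only: regions are represented by plain numbers.
target <- c(10, 40, 50)   # desired areas (hypothetical)
area   <- c(60, 30, 10)   # starting areas (hypothetical)

itermax <- 100            # stopping rule 1: cap on the number of iterations
max_err <- 1.01           # stopping rule 2: acceptable mean size error

for (i in seq_len(itermax)) {
  # Assumed size error per region: ratio of the larger to the smaller
  # of current and target area (1 means a perfect match).
  size_err <- pmax(area, target) / pmin(area, target)
  mean_err <- mean(size_err)

  # Stop early once the mean error is small enough.
  if (mean_err < max_err) break

  # Stand-in for a distortion step: move each area halfway to its target.
  area <- area + 0.5 * (target - area)
}

print(i)
print(mean_err)
print(area)
```

With a small iteration cap the loop ends before the error criterion is met, which is consistent with the advice in this thread to allow more iterations for difficult shapes.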
Thanks for making the cartogram package available. I'm trying to make a cartogram of TB burden in Africa (the whole world is my goal; Africa is just a stepping stone), but the output does not scale by the burden. The TB burden ranges from 0 to about 438000. The output after 500 iterations is below; the problem is that country area is not scaled by TB burden:
Here is the code I used to generate this:
And a link to the tb_burden.tsv for replication.
I have also tried dividing the TB burden by 1000, with the same results. I also log-transformed the TB burden, which gave the result below. It doesn't make sense, because South Africa and Nigeria should be the largest (highest burden):
Any help would be greatly appreciated.