IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

Add dialogue for spatial data into R-Instat #6665

Open rdstern opened 3 years ago

rdstern commented 3 years ago

This follows our aim of being able to use ggplot2 for the "standard" graphs in the Data Visualisation with R book. Chapter 6 of that book is on maps. It has two sections: ordinary maps using ggmap, and choropleth maps with the choroplethr package. In addition, I assume we want contour plots.

We also (perhaps more urgently) need to be able to produce heatmaps. They can be a further option on the same dialogue. Simple versions of these "maps" use either geom_tile or geom_raster. It is convenient to have geom_raster on the same dialogue, because raster-type data is often mapped.

In base R there are heatmaps that are combined with dendrograms. There is an attractive version of this in ggplot through the heatmaply package, which also links with plotly. This link with cluster analysis is interesting, and common examples use a data matrix, with the rows and columns forming the data. The heatmaps we want usually have (instead) an x and a y, often factors, together with a numeric variable that is coloured according to its value. This isn't so different, because we can turn the many variables into a single variable just by reshaping the data. However, I suggest we reserve the graphs from the heatmaply package for a cluster analysis dialogue, or (at least) include them in the multivariate menu. In this dialogue we would then have just the simple heatmaps. The guide called "Creating easy heatmaps in ggplot2" may be useful. Here is an example:

image
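As a rough sketch of the reshaping step described above, the following turns a small wide table into the long x / y / value form and builds a simple geom_tile heatmap from it. The data, and the names `wide`, `long`, `farmer`, `treatment` and `yield`, are made up for illustration and are not taken from any R-Instat dataset. The plot is only built if ggplot2 is installed.

```r
# Hypothetical wide data: one row per farmer, one column per treatment
wide <- data.frame(
  farmer = c("F1", "F2", "F3"),
  A = c(2.1, 3.4, 1.8),
  B = c(2.9, 3.0, 2.2),
  C = c(3.5, 4.1, 2.7)
)

# Reshape to long form with base R: one row per farmer/treatment pair,
# so the three treatment columns become a single numeric variable
long <- reshape(
  wide,
  direction = "long",
  varying = c("A", "B", "C"),
  v.names = "yield",
  timevar = "treatment",
  times = c("A", "B", "C"),
  idvar = "farmer"
)
rownames(long) <- NULL

# Simple heatmap: two categorical axes, fill colour from the numeric variable
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = treatment, y = farmer, fill = yield)) +
    geom_tile()
  # print(p) draws the plot
}
```

Swapping geom_tile() for geom_raster() should give the same picture when the tiles are all the same size, which is the usual case for raster-type data.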

By the way, choropleth maps are essentially heatmaps based on latitude and longitude, so they fit well on the same dialogue.

So the dialogue might have buttons for map / choropleth / contour / heatmap.

Details still need some thought. I am not sure whether the mapping dialogue in the Climatic menu can be used.

N-thony commented 3 years ago

@rdstern what about this issue? Is someone working on it, or should I take it over?

rdstern commented 3 years ago

The heatmap is now just about working. Here it is in R-Instat: image

I have sent you the data file. Here it is again: S Sudan example with heatmap.xlsx

This also shows you the same data as a heatmap in Excel - that's on the first sheet. Like so: image

And here it is in R-Instat, in the Describe > 2 variables > pivot table dialogue:

image

What do you notice in the comparisons? I can see a lot of scope for improvements to this implementation. Not before merging the current version, but for later.

a) The scales are nicer in Excel and in the R-Instat pivot table than any of those in the current R-Instat heatmap. A useful quick win would be to extend the range of scales that we can use. In particular, the Excel and pivot table scales stay much lighter, so all the numbers can be seen.

b) The graph in Excel is in sorted order of both the farmers and the treatments. We can now do this sorting in the bar chart dialogue, and it would be useful to add those controls here too. They might need adapting: the sorting is not on frequencies but on a variable, as in the bar chart with the data option.

c) There are margins given. That will need a bit more R here.

d) The labels automatically format the numbers. I'd like that option for our labels here.
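To make points (a) to (d) concrete, here is a minimal ggplot2 sketch of what each might look like in plain R. It is not the R-Instat implementation. The data and the names `farmer`, `treatment`, `yield` and `label` are hypothetical, and the colours are arbitrary choices. The plot is only built if ggplot2 is installed.

```r
# Hypothetical long-format data: one row per farmer/treatment pair
long <- expand.grid(
  farmer = c("F1", "F2", "F3"),
  treatment = c("A", "B", "C"),
  stringsAsFactors = FALSE
)
long$yield <- c(2.1, 3.4, 1.8, 2.9, 3.0, 2.2, 3.5, 4.1, 2.7)

# (b) Sort both axes by a variable (here the mean yield), not by frequency
long$farmer <- reorder(long$farmer, long$yield)
long$treatment <- reorder(long$treatment, long$yield)

# (c) Margins: row and column means, computed separately from the plot
farmer_means <- tapply(long$yield, long$farmer, mean)
treatment_means <- tapply(long$yield, long$treatment, mean)

# (d) Formatted labels: one decimal place for every cell
long$label <- sprintf("%.1f", long$yield)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = treatment, y = farmer, fill = yield)) +
    geom_tile(colour = "white") +
    geom_text(aes(label = label)) +
    # (a) A light scale, so the cell labels stay readable
    scale_fill_gradient(low = "white", high = "steelblue")
  # print(p) draws the plot
}
```

The margins are only computed here, not drawn. Showing them on the graph itself would need extra rows and columns in the data or a separate layer, which is the "bit more R" mentioned in (c).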