benjann / geoplot

Stata module to draw maps
MIT License

bins not adjusting if `if` condition used #7

Closed by asjadnaqvi 1 year ago

asjadnaqvi commented 1 year ago

See example below for France NUTS3 regions:

geoplot ///
    (area nuts3_shp y_MEDAGEPOP if CNTR_CODE=="FR", level(8))  ///
    (line nuts1_shp if CNTR_CODE=="FR", lc(black) lw(0.1) ) ///
    , legend(pos(2))  

The bins should adjust to the range of values for France.

[Image: geoplot_test3]

benjann commented 1 year ago

I believe this is now fixed.

asjadnaqvi commented 1 year ago

I checked and the issue still persists for France only! Other countries are working fine. This is because France has an island territory FRY50 that exists in the data but has been dropped from the shapefile.


So the cut-offs are being generated based on the main data file, but plotting is limited to areas available in the shapefile. One option would be to check the data against the shapefile with a merge and discard entries for which no boundaries exist.
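A minimal sketch of such a merge check, using only built-in Stata commands. It assumes the attribute file is nuts3.dta and the shape file is nuts3_shp.dta, and that both carry the _ID key and the _X/_Y coordinate variables in the layout written by spshape2dta; the tempfile name has_shape is hypothetical.

```stata
* Sketch only; file and variable names are assumptions (spshape2dta layout).
* Step 1: collect the IDs that have at least one row with real coordinates.
use nuts3_shp, clear
drop if missing(_X) | missing(_Y)   // drop separator rows and empty shapes
keep _ID
duplicates drop                     // one row per unit that has a boundary
tempfile has_shape
save `has_shape'

* Step 2: compare the attribute file against that list.
use nuts3, clear
merge 1:1 _ID using `has_shape', keep(master match)
list _ID if _merge == 1             // units with data but no boundary
drop if _merge == 1                 // discard them before computing cut-offs
drop _merge
```

If this is run for the France example, a unit that has attribute data but no usable boundary would show up in the listing before it is dropped.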

benjann commented 1 year ago

oh, interesting problem; yes, I guess I should make sure that only units are used which also appear in the shape file

benjann commented 1 year ago

It is clear to me how this can happen when working with linked frames (the new approach). However, it seems you are working on the shape file with aliases above (i.e. geoframe attach). In this case FRY50 should not be taken into account if it does not exist in the shape file. Or could it be that FRY50 does exist in the shape file, but its coordinates have been set to missing or similar?

asjadnaqvi commented 1 year ago

I checked and it exists in the shapefile as an empty shape with just one row.

This was probably left in there when cleaning up the shapefiles in QGIS. But a single-row shape should be flagged or dropped by spshape2dta. Another thing to keep in mind...

benjann commented 1 year ago

One possibility would be to provide some geoframe commands for common tasks such as removing empty shapes...

benjann commented 1 year ago

I modified geoplot such that it now (a) no longer plots shapes of units that appear in the shapefile but do not exist in the attributes file (when plotting the attributes file with plottype area or line) and (b) excludes units that have empty shape data (again in area and line). Shape data is "empty" if there is only one row in the shape file for that unit and all coordinate variables in that row are missing. The above issue should be solved by (b).
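For anyone who wants to find such units by hand, the "empty" rule can be sketched in plain Stata. This is not geoplot's internal code. It assumes a shape file nuts3_shp.dta with the variables _ID, _X and _Y as written by spshape2dta; row_order, n_rows, n_coords and empty_shape are hypothetical helper variables.

```stata
* Sketch only; not geoplot's implementation. Names are assumptions.
use nuts3_shp, clear
gen long row_order = _n                        // remember the vertex order
bysort _ID (row_order): gen long n_rows = _N   // rows per unit
* count rows per unit in which at least one coordinate is nonmissing
by _ID: egen long n_coords = total(!missing(_X) | !missing(_Y))
* empty = exactly one row, and every coordinate in it is missing
gen byte empty_shape = (n_rows == 1 & n_coords == 0)
list _ID if empty_shape                        // inspect before dropping
drop if empty_shape
sort _ID row_order                             // restore the original order
drop row_order n_rows n_coords empty_shape
```

The explicit row_order variable is there because Stata's sort is not stable by default, and polygon vertices must stay in their original sequence.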

I probably should also exclude units with missing coordinates/centroids in the attributes file if such a file is plotted using point (or pcarrow etc.). Currently, these units will still be used in point when categorizing colors or when determining the scale of markers. But I am not fully sure; this would deviate from the normal behavior of twoway scatter, which uses all observations within if and in and does not exclude observations that have missing values (so that their weights are included in the calculation of marker sizes)...

asjadnaqvi commented 1 year ago

This now works perfectly, for areas at least:

geoplot ///
    (area nuts3 y_MEDAGEPOP if CNTR_CODE=="FR", level(8) color(, reverse))  ///
    (line nuts1 if CNTR_CODE=="FR", lc(black) lw(0.1) ) ///
    , legend(pos(2))  

[Image: geoplot_test3]