r-spatial / spdep

Spatial Dependence: Weighting Schemes and Statistics
https://r-spatial.github.io/spdep/
122 stars 26 forks source link

Adding an implementation of the GSi hotspot statistic #109

Closed rwesterh closed 1 year ago

rwesterh commented 1 year ago

This is an implementation of the GSi hotspot statistic for analysing different geometric scales. This method is a variant of the well-known Gi* hotspot statistic.

rwesterh commented 1 year ago

Dear Roger. Finally, after sorting out a few things like using 'F' instead of 'FALSE' in the Rd file, all the checks have gone through successfully. I am sorry if you have received some emails from the various pull requests.

rsbivand commented 1 year ago

OK, thanks, I'll merge now to make progress.

rsbivand commented 1 year ago

@rwesterh I'm looking at changing the planar projection in localGS.Rd to 26786, which has the same datum as the input shapefile (NAD27), but is in US feet not meters. Avoiding the datum transformation matters, to avoid searching for grids from the CDN if the CDN is not enabled. I realise that the problem would re-emerge if any user tried to display on an interactive webmap.

> sf::sf_proj_pipelines("EPSG:4267", "EPSG:26786")
Candidate coordinate operations found:  1 
Strict containment:     FALSE 
Axis order auth compl:  FALSE 
Source:  EPSG:4267 
Target:  EPSG:26786 
Best instantiable operation has accuracy: 0 m
Description: axis order change (2D) + Massachusetts CS27 Mainland zone 
Definition:  +proj=pipeline +step +proj=unitconvert +xy_in=deg +xy_out=rad
             +step +proj=lcc +lat_0=41 +lon_0=-71.5
             +lat_1=41.7166666666667 +lat_2=42.6833333333333
             +x_0=182880.365760732 +y_0=0 +ellps=clrk66 +step
             +proj=unitconvert +xy_in=m +xy_out=us-ft

but (abbreviated):

> sf::sf_proj_pipelines("EPSG:4267", "EPSG:32619")
hgridshift: could not find required grid(s).
pipeline: Pipeline: Bad step definition: proj=hgridshift (File not found or invalid)
...
hgridshift: could not find required grid(s).
pipeline: Pipeline: Bad step definition: proj=hgridshift (File not found or invalid)
Candidate coordinate operations found:  79 
Strict containment:     FALSE 
Axis order auth compl:  FALSE 
Source:  EPSG:4267 
Target:  EPSG:32619 
Best instantiable operation has accuracy: 1 m
Description: axis order change (2D) + NAD27 to WGS 84 (88) + UTM zone 19N 
Definition:  +proj=pipeline +step +proj=unitconvert +xy_in=deg +xy_out=rad
             +step +proj=push +v_3 +step +proj=cart
             +ellps=clrk66 +step +proj=helmert +x=2.478
             +y=149.752 +z=197.726 +rx=-0.526 +ry=-0.498
             +rz=0.501 +s=0.685 +convention=coordinate_frame
             +step +inv +proj=cart +ellps=WGS84 +step +proj=pop
             +v_3 +step +proj=utm +zone=19 +ellps=WGS84
Operation 31 is lacking 1 grid with accuracy 2 m
...

EPSG:2249 is a NAD83 planar alternative, but is also in US feet. Here only ballpark accuracy because of missing grids:

> sf::sf_proj_pipelines("EPSG:4267", "EPSG:2249")
hgridshift: could not find required grid(s).
pipeline: Pipeline: Bad step definition: proj=hgridshift (File not found or invalid)
...
hgridshift: could not find required grid(s).
pipeline: Pipeline: Bad step definition: proj=hgridshift (File not found or invalid)
Candidate coordinate operations found:  10 
Strict containment:     FALSE 
Axis order auth compl:  FALSE 
Source:  EPSG:4267 
Target:  EPSG:2249 
Best instantiable operation has only ballpark accuracy 
Description: axis order change (2D) + Ballpark geographic offset from NAD27
             to NAD83 + SPCS83 Massachusetts Mainland zone (US
             Survey feet)
Definition:  +proj=pipeline +step +proj=unitconvert +xy_in=deg +xy_out=rad
             +step +proj=lcc +lat_0=41 +lon_0=-71.5
             +lat_1=42.6833333333333 +lat_2=41.7166666666667
             +x_0=200000.0001016 +y_0=750000 +ellps=GRS80 +step
             +proj=unitconvert +xy_in=m +xy_out=us-ft
Operation 1 is lacking 1 grid with accuracy 1.5 m
Missing grid: ca_nrc_ntv2_0.tif 
...

What do you think? The distance bands would need to be re-cast for US survey feet I think. The underlying problem is needing to do:

> sf::sf_proj_network()
[1] FALSE
> sf::sf_proj_network(TRUE)
[1] "https://cdn.proj.org"
> sf::sf_proj_network()
[1] TRUE
> sf::sf_proj_pipelines("EPSG:4267", "EPSG:2249")
Candidate coordinate operations found:  10 
Strict containment:     FALSE 
Axis order auth compl:  FALSE 
Source:  EPSG:4267 
Target:  EPSG:2249 
Best instantiable operation has accuracy: 0.15 m
Description: axis order change (2D) + NAD27 to NAD83 (1) + SPCS83
             Massachusetts Mainland zone (US Survey feet)
Definition:  +proj=pipeline +step +proj=unitconvert +xy_in=deg +xy_out=rad
             +step +proj=hgridshift +grids=us_noaa_conus.tif
             +step +proj=lcc +lat_0=41 +lon_0=-71.5
             +lat_1=42.6833333333333 +lat_2=41.7166666666667
             +x_0=200000.0001016 +y_0=750000 +ellps=GRS80 +step
             +proj=unitconvert +xy_in=m +xy_out=us-ft
Operation 6 is lacking 1 grid with accuracy 2 m
Missing grid: GS2783v1.QUE 
...

with very good accuracy once the CDN is enabled. The pipeline listing checks on all candidates, but chooses the best feasible. We haven't chosen to activate CDN by default, because it has to use internet access.

rsbivand commented 1 year ago

Further, the function itself should only use "sf" objects to simplify. Should a user pass x as an sp object, coerce to sf immediately. Probably use of s2 should be tested too.

rwesterh commented 1 year ago

Thanks for your messages @rsbivand. I will look into those issues on the upcoming weekend.

rwesterh commented 1 year ago

@rsbivand Sorry for the delay in my response. I have now found some time to look into the code and to think about the issues that you raised. I will issue a pull request shortly to merge the changes into the main branch.

The difficulty with the latter is that it would be unclear whether a user is giving these values in metres (which would sometimes need to be converted to feet) or in feet (which might need to be converted to metres for some CRSs). Therefore, I think it would be easiest to just make it clear that the user is expected to provide the numbers in the relevant map units. An alternative approach would be to add another parameter "units" allowing users to specify in a string whether the function should work with "metre" or "US survey foot". The validity of this specification could be checked against the corresponding property of the CRS object of the layer. What do you think is the best approach to this?

rsbivand commented 1 year ago

No new parameter; handling proper "units" from units is complex and has the possibility of creating further downstream complications. I think stressing the user's responsibility for using relevant map units is sufficient.