eblondel / cleangeo

Cleaning geometries from spatial objects in R
https://github.com/eblondel/cleangeo/wiki

Running time for clgeo_Clean() #19

Closed fraba closed 1 year ago

fraba commented 7 years ago

I ran clgeo_Clean() on a 6 MB shapefile, and it is still running after almost 24 hours. The data can be downloaded from http://www.istat.it/storage/cartografia/confini_amministrativi/non_generalizzati/2011/Limiti_2011_WGS84.zip: it is the Reg2011_WGS84 file with all features merged into one, which I renamed Ita2011_WGS84.

This is my code:

sp <- readShapePoly('Ita2011_WGS84')
report <- clgeo_CollectionReport(sp)
clgeo_SummaryReport(report)
#            type      valid           issue_type
#  rgeos_validity:1  Mode :logical  GEOM_VALIDITY:1
#                    FALSE:1
#                    NA's :0
sp.clean <- clgeo_Clean(sp)

eblondel commented 7 years ago

By default, the strategy is set to POLYGONATION, which covers more geometry validity issues but consumes more time and memory than the original BUFFER strategy (BUFFER can be fast, but unfortunately it does not fix every kind of validity issue accurately). POLYGONATION is still at an experimental stage. As a first step, you can try changing the strategy to 'BUFFER'. Processing time will probably be reduced, but your geometry might be altered more than expected. When I have spare time, I will have a look at your shapefile. Best
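
Before committing to a long run, it can help to try both strategies on a tiny invalid polygon. The following is only a minimal sketch: the "bowtie" ring is toy data that is not part of this issue, and it assumes the sp, rgeos and cleangeo packages are installed (the block skips the comparison otherwise). Timings on such a small shape say nothing about a high-resolution dataset; the point is only to see how each strategy is called and whether the result passes gIsValid().

```r
# Toy data (hypothetical, not from the issue): a self-intersecting "bowtie" ring
bowtie_coords <- cbind(x = c(0, 1, 1, 0, 0), y = c(0, 1, 0, 1, 0))

# Only run the comparison when all assumed packages are available
needed <- c("sp", "rgeos", "cleangeo")
have_all <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))

if (have_all) {
  ring <- sp::Polygon(bowtie_coords)
  toy <- sp::SpatialPolygons(list(sp::Polygons(list(ring), ID = "bowtie")))

  for (strategy in c("BUFFER", "POLYGONATION")) {
    # Time one cleaning run with the given strategy
    elapsed <- system.time(
      cleaned <- cleangeo::clgeo_Clean(toy, strategy = strategy)
    )[["elapsed"]]

    # Guard against a run that returns nothing usable
    valid <- if (is.null(cleaned)) NA else rgeos::gIsValid(cleaned)
    cat(strategy, ": valid =", valid, ", elapsed =", elapsed, "s\n")
  }
}
```

The same loop could be pointed at a small subset of the real data (for example a single feature) to get a rough idea of how the two strategies scale before running on everything.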


eblondel commented 7 years ago

Hello @fraba, I've tried it on Reg2011_WGS84 (I didn't find any Ita2011_WGS84 shapefile in the ZIP). Given the high resolution of your polygon delineation, POLYGONATION is indeed going to be a very long process, even though it is the more accurate method for fixing issues. This strategy is a new, experimental one I've added to cleangeo to cover more geometry issues, but I still have to think about whether and how the algorithm's performance could be improved.

In your case, I've tried applying a basic BUFFER strategy, and your shapes seem well suited to it. You should not face performance issues with it:

#packages
require(sp)
require(maptools)
require(rgeos) #provides gIsValid(), used below
require(cleangeo)

#load data
sp <- readShapePoly('Ita2011_WGS84', proj4string = CRS("+init=epsg:4326"))

#geometry report
report <- clgeo_CollectionReport(sp)
clgeo_SummaryReport(report) #report gives 2 validity issues

#using buffer strategy
#==============
system.time(sp.clean.buffer <- clgeo_Clean(sp = sp, strategy = "BUFFER"))
#system time
#----------------------
# user  system elapsed 
# 1.960   0.004   1.959

gIsValid(sp.clean.buffer)

#look at plots for detecting possible geometry alteration issues or holes issues
plot(sp[12,])
plot(sp.clean.buffer[12,], add = TRUE, col="lightblue")
plot(sp[20,])
plot(sp.clean.buffer[20,], add = TRUE, col="lightblue")
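
The visual check above can be complemented with a number. Here is a rough sketch that compares per-feature areas before and after cleaning; relative_area_change() is a hypothetical helper written for this example, not a cleangeo function. It assumes rgeos is installed and that cleaning kept every feature, so both objects have the same length. Note that the area reported for an invalid geometry may itself be unreliable, and that areas of unprojected lon/lat data are in squared degrees, so only read this as a relative indicator.

```r
# Hypothetical helper: relative area change per feature, in percent.
# Inputs are plain numeric vectors of equal length.
relative_area_change <- function(area_before, area_after) {
  stopifnot(length(area_before) == length(area_after))
  100 * (area_after - area_before) / area_before
}

# Usage with the objects from the script above; skipped if they do not exist
# or if rgeos (an assumption of this sketch) is not installed.
if (exists("sp.clean.buffer") && exists("sp") &&
    requireNamespace("rgeos", quietly = TRUE)) {
  before <- rgeos::gArea(sp, byid = TRUE)
  after <- rgeos::gArea(sp.clean.buffer, byid = TRUE)
  print(round(relative_area_change(before, after), 4))
}
```

Features with a change far from zero are the ones worth plotting first.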

Let me know

eblondel commented 7 years ago

Hi @fraba, did you have time to look at the above test?