unbrother / traffic-major-minor

GNU General Public License v3.0
4 stars 1 forks source link

Road importance #3

Open unbrother opened 4 years ago

unbrother commented 4 years ago

The road importance part of the postGRE code first makes subdivisions enclosed by the major roads. I am not sure how to do this in R, I believe I would just need to draw a polygon between those lines but do not know how to make R select the enclosed areas. After that they made something different for roads outside of these areas, which they name coastal roads. I have not checked this in full but I am sure it will be a challenge for me.

Until now, I used dodgr with weight_streetnet() to get some kind of relative weighting and I was considering that as the road importance. But this is not the process originally used.

@mem48 and @Robinlovelace, do you have any pointers or ideas on how to attack this part?

Robinlovelace commented 4 years ago

These lines of code may help: https://github.com/cyipt/popupCycleways/blob/master/code/build.R#L129-L132

The rnet object is the equivalent of the major roads and it's finding which elements in r_main_region are inside a 10m buffer around those major roads.

mem48 commented 4 years ago

@unbrother

I had an idea overnight so I've committed some extra code.

routing from everywhere to everywhere is the same as edge betweenness centrality. And dodgr has a new function to do this directly. This removed all the faffing making the flow matrix.

But for some reason the results are lost when converting back to sf object (see https://github.com/ATFutures/dodgr/issues/138) to I've written a little function that uses stplanr::overline2 https://github.com/unbrother/traffic-major-minor/blob/master/R/dodgr_central.R this runs quickly and give good results.

image

mem48 commented 4 years ago

I did a quick spatial join between the traffic points on the minor road and the minor road centrality

image

The are a couple of outliers to investigate but broadly looks good

unbrother commented 4 years ago

@mem48 I am getting a different centrality map when running the 02c code.

image

Also, I can't get past line 169 as I don't have the object "gsf".

# Match Up osm_minor_mod with gsf -----------------------------------------
summary(unique(gsf$dat.way_id) %in% unique(osm_minor$osm_id))

I suppose it is created in the sfnetwork_test file but I get an error

> net <- as_sfnetwork(osm_minor)
Error in `[[<-.data.frame`(`*tmp*`, i, value = c(1L, 1L, 4L, 5L, 7L, 8L,  : 
  replacement has 4060 rows, data has 4095

Not sure about either why I am getting different results. I restarted R and have even updated to version 4.

mem48 commented 4 years ago

@unbrother sorry my mistake I forgot to commit my final changes, try now.

I think you are getting the same result, but I adjusted the colour bands on the plot I posted. The code should be in the latest commit

mem48 commented 4 years ago

The sfnetwork package doesn't work with roundabouts. Which is what causes the error you got. I think that it might be a dead end until they fix this bug https://github.com/luukvdmeer/sfnetworks/issues/59

Robinlovelace commented 4 years ago

Heads-up: that issue is fixed in the development version. See here: https://github.com/luukvdmeer/sfnetworks/issues/59#issuecomment-647760605

remotes::install_github("luukvdmeer/sfnetworks", ref = "develop", quiet = TRUE)

There are other issues you should consider when using that approach, as documented here: https://osf.io/wtqas/

unbrother commented 4 years ago

@mem48 After using the code with auto breaks, I get the same numbers for centrality, but can't get the code to go further because of the gsf object. Previously, we created this object by using dodgr results on flow, and it kept the osm_id so it was easier to join with the network

imp_score <- st_drop_geometry(gsf[,c("dat.way_id","dat.flow")])
osm_minor <- left_join(osm_minor, imp_score, by = c("osm_id" = "dat.way_id"))

As we don't have that now, would an st_join work correctly? This is the "graphs" object: image Or am I missing some other step?

unbrother commented 4 years ago

I managed to do it like this

# Match Up osm_minor_mod with gsf -----------------------------------------
summary(unique(graphs$geometry) %in% unique(osm_minor$osm_id))

osm_minor <- st_join(osm_minor, graphs, left = TRUE, largest = TRUE)
osm_minor$centrality[is.na(osm_minor$centrality)] <- 0

tm_shape(osm_minor) +
  tm_lines(col = "centrality", lwd = 3, style = "jenks")

# Find the AADT on the major road ofr each junction point -----------------
junc_majmi <- st_as_sf(junc_majmi, coords = c("X","Y"), crs = 4326, remove = FALSE)
osm_major <- st_transform(osm_major, 4326)
junc_majmi <- st_join(junc_majmi, osm_major[,"aadt"])

osm_minor$aadt2 <- junc_majmi$aadt[match(osm_minor$nearst_junction ,junc_majmi$from_id)]
qtm(osm_minor, lines.col = "aadt2", lines.lwd = 3) # Relative flows number

osm_minor <- osm_minor %>% st_drop_geometry() %>% dplyr::select(osm_id, centrality, aadt2)
osm_aadt_maj <- left_join(osm %>% as.data.frame(), osm_minor %>% as.data.frame(), by = "osm_id")
osm_aadt_maj <- osm_aadt_maj %>% st_sf(sf_column_name = 'geom')

for(i in 1:nrow(osm_aadt_maj)){
  if(is.na(osm_aadt_maj$aadt2[i])){
    osm_aadt_maj$aadt2[i] = osm_aadt_maj$aadt[i]
  }
}

qtm(osm_aadt_maj, lines.lwd = 3, lines.col = "aadt2")

Now my idea is to get centrality for the whole network, subset only major roads and join the minor and major roads objects together (wuth centrality). I tried doing the current process for minors but with majors but since I get a less dense network, I get centrality values that are very low (this might work but I have doubts).

To estimate centrality for the whole network, I am using the coding inside the dodgr_central() function, and I get a graph result which seems correct, but when I try to join this object with the osm (or any other network) I get no joins. Seems like the geometries do not match.

major_centrality <- weight_streetnet(osm, wt_profile = "motorcar")
major_centrality <- dodgr_centrality(major_centrality)

major_cent <- dodgr_to_sf(major_centrality)
major_cent <- stplanr::overline2(major_cent, attrib = "centrality")

#g = major_centrality
g <- dodgr_centrality(major_centrality)
major_geom <- major_centrality[,c("from_lon","from_lat","to_lon","to_lat")]
g <- major_centrality[,!names(major_centrality) %in% c("from_lon","from_lat","to_lon","to_lat","from_id","to_id")]
major_geom <- as.matrix(major_geom)
major_geom <- split(major_geom, 1:nrow(major_geom))
major_geom <- lapply(major_geom, function(x){
  sf::st_linestring(matrix(x, ncol = 2, byrow = TRUE))
})
major_geom <- sf::st_as_sfc(major_geom)
g$geometry <- major_geom
g <- st_as_sf(g, crs = 4326)
g <- stplanr::overline2(g, attrib = "centrality")

image I just get NAs image I´ve commited this test as major_centrality_test

After this, I can attach the categorical variables roadtype (major/minor) and aretype (urban/rural) to start trying the glm.

mem48 commented 4 years ago

@unbrother I've cleaned and updated my code

It should now run from end to end and make the following plots

image

image

image

image

image

image

mem48 commented 4 years ago

I get a model to predict traffic with an r squared of 0.35 so not great, but a start. Now the task is to work out how to improve those values. I'd definitely try to fix those two points with very high centrality but low traffic, I expect the weighting of the roads might be the problem.

mem48 commented 4 years ago

Intresting paper to look at https://www.sciencedirect.com/science/article/pii/S096669231930568X

unbrother commented 4 years ago

@mem48 I added the areatype variable and get and r squared of 0.65, but still not convinced by the predictions.

# Assign area type (urban and rural) ---------------------------------------

# Import Strategi shp

strategi <- st_read("data/strategi/urban_region.shp") #Find if this can be obtained directly
strategi <- st_transform(strategi, 4326)

# Classify as urban or rural

traffic_areatype <- st_join(traffic_minor, strategi, by = c("geom" = "geometry"))
traffic_areatype <- traffic_areatype %>%
  mutate(areatype = ifelse(LEGEND %in% c("Large Urban Area polygon"), "urban", "rural"))

# log model with added variable
m2a <- lm(log(aadt) ~ log(centrality) + log(major_aadt) + areatype, data = traffic_areatype)
summary(m2a)
plot(traffic_areatype$aadt[!is.na(traffic_areatype$centrality)], exp(predict(m2a)))
abline(0,1, col = "red")

Then I tried the poisson model which replicates the same family used in the paper and get similar results

# log model poisson
m3 <- glm(aadt ~ log(centrality) + log(major_aadt) + areatype, data = traffic_areatype, family = "poisson")
summary(m3)
plot(traffic_areatype$aadt[!is.na(traffic_areatype$centrality)], exp(predict(m3)))
abline(0,1, col = "red")

I am not entirely sure if having more data would help the model, but if so, I believe having major roads in the set is important. But now I am a little confused on how to add both sets together, I am working in a file named major_centrality_test but still with limited results. Since the treatment for osm_minor and osm_major has been different, I am having trouble finding where some variables get added (or dropped).

unbrother commented 4 years ago

This is the current version of the code that tries to process major roads.

major_centrality <- weight_streetnet(osm, wt_profile = "motorcar")
major_centrality <- dodgr_centrality(major_centrality)
major_centrality <- merge_directed_graph(major_centrality)
clear_dodgr_cache()
major_centrality <- dodgr_to_sf(major_centrality)

summary(major_centrality$centrality)
tm_shape(major_centrality) +
  tm_lines(col = "centrality", lwd = 3)

graphs_major <- left_join(major_centrality,
                    st_drop_geometry(osm_major[,c("osm_id","aadt")]),
                    by = c("way_id" = "osm_id"))
graphs_major <- graphs_major[graphs_major$highway %in% c("motorway","motowray_link","primary","primary_link","trunk","trunk_link"),]
colnames(graphs_major)[which(names(graphs_major) == "aadt")] <- "major_aadt"

traffic_major <- traffic[!traffic$road %in% c("C","U"),]

traffic_major <- st_buffer(traffic_major, 33)
traffic_major <- st_transform(traffic_major, 4326)

traffic_major <- st_join(traffic_major, graphs_major)

It is similar to that from minor without the nearest junction part (I just assumed that the "aadt from nearest major road" is its own. But when I try to join traffic counts with the results from these lines I get all NAs. I tried changing the buffer but it does not help. image

I tried to plot it and apparently somewhere I am creating polygons instead of lines.

qtm(traffic_major, lines.col = "aadt", lines.lwd = 3)
Error: traffic_major consists of polygons, so it cannot accept tm_lines.
mem48 commented 4 years ago

Ok I think you need to check the docs on joins, for example

traffic_areatype <- st_join(traffic_minor, strategi, by = c("geom" = "geometry"))

doesn't need a by argument as it is a spatial join and is joining objects that st_intersect

dplyr::left_join

Is an attribute join and needs aby argument becuase it is joining two tables by a shared ID value

mem48 commented 4 years ago

for

traffic_major <- st_join(traffic_major,graphs_major)

You are joining the second argument onto the first argument so the geometry of the fist is kept. THis is why you get the tmap error. I've switched the order to make it work.

I also had to add a

osm <- st_transform(osm, 4326)
osm_major <- st_transform(osm, 4326)

to get your code to work on my computer

mem48 commented 4 years ago

Be careful when joining for creating duuplicates. For example when you have two overlapping buffers and join, you might end up with a road being duplicated

mem48 commented 4 years ago
> qtm(traffic_major, lines.col = "aadt", lines.lwd = 3)
> summary(duplicated(traffic_major$geometry))
   Mode   FALSE    TRUE 
logical     629     903 

image

mem48 commented 4 years ago

I also added cor tests to you results

> cor(traffic_minor$aadt[!is.na(traffic_minor$centrality)], predict(m1))
[1] 0.6065638
> cor(traffic_minor$aadt[!is.na(traffic_minor$centrality)], predict(m2))
[1] 0.5903713
> cor(traffic_areatype$aadt[!is.na(traffic_areatype$centrality)], exp(predict(m2a)))
[1] 0.7434034
> cor(traffic_areatype$aadt[!is.na(traffic_areatype$centrality)], exp(predict(m3)))
[1] 0.8007769

So you're clearly making progress

mem48 commented 4 years ago

If you think you need more data, you could try running the code on a larger area e.g Leeds. The Isle of Wight is good for testing as it is very small, but it not a typical place, and you now seem to have the basics working.

I'd suggest reformatting the code a little so that you can run the timeconsuming bits once and then save the results. You will want to play around with different models and variables, and you don't have to recalculate centrality every time.