ipeaGIT / r5r

https://ipeagit.github.io/r5r/
Other
183 stars 31 forks source link

detailed_itineraries() with results per segment? #298

Open temospena opened 2 years ago

temospena commented 2 years ago

Hi all,

We are using r5r to estimate routes with cycling potential, to support decisions on the cycling infrastructure priorities, and to point to which parts of the existing road network are not safe enough for cycling, in a planning perspective that those should have more attention to invest for safer cycling.

The detailed_itineraries() function is awesome to model all the (potential) routes between an OD dataset. The results come in 1 row per route per used mode. The street_network__to_sf() function allows to export all the edges that r5r is using for this routing problem, including the osm_id (id in this case) and bicycle_lts level tags.

Once that we have this information, it should be possible to get more information for every route, listing all the road network segments that were used for that route. And with that information, we could assess LTS levels per segment, and see which are quiet and not, for this purpose.

We know that CycleStreets routing function retrieves the results 1 per segment, which is very useful.

Would it be possible to add a option argument to detailed_itineraries() to retrieve the results per segment, for the routes with mode = "BICYCLE"? I can imagine that it wouldn't be the default, since it increases drastically the results object size, but this would be very handful, once we don't have a good solution to match the r5r results with the street_network results.

This is something that me and @Robinlovelace are struggling with, for different projects.

Thank you in advance!

Robinlovelace commented 2 years ago

For an example of what per segment route outputs look like, see the example here: https://rpackage.cyclestreets.net/

This will be really useful for our work, there are work-arounds but they are a pain. Not sure if it's even possible on the R5 side looking at the issues. One interesting note from my colleague @Hussein-Mahfouz: in R5 terminology a segment is a leg of a journey made by a single mode, right? I think it's more conventional and useful to reserve segment for the linestrings of which transport networks and routes along them are composed, analogous to OSM ways. Interested to learn about other types of terminology, especially if they are used a lot in practice, and not 100% sure how the terms segment and leg are used in R5 and other routing engine documentation.

rafapereirabr commented 2 years ago

Hi Rosa and Robin. Thanks for the suggestion. As Robin said, there is no easy and efficient way to do this from the R side. I will check with @mvpsaraiva if perhaps this is something that could be done on the Java side based on the output created from R5.

Regarding the terminology, I totally understand your point. We use the term "leg" a lot in the documentation of the detailed_itineraries() function and I believe the term 'segment' or 'trip segment' would be more appropriate in some of those cases. I'll look into that, but please feel free to suggest any particular changes you see fit

Robinlovelace commented 2 years ago

Regarding the terminology, I totally understand your point. We use the term "leg" a lot in the documentation of the detailed_itineraries() function and I believe the term 'segment' or 'trip segment' would be more appropriate in some of those cases. I'll look into that, but please feel free to suggest any particular changes you see fit

When the documentation refers to single mode routes, composed of multiple short segments on the network with different attributes such as LTS, I think leg is the most appropriate term. My recollection is that 'segment' is used in R5 docs to refer to these legs. If so replacing 'segment' with 'leg' would be my recommendation.

mvpsaraiva commented 2 years ago

Hi all. Indeed, this is a very good idea, that can make r5r more useful in many new situations. We are kind of lucky with the timing of this feature request, because I did a complete rewrite of the Java code of detailed_itineraries() for the future release of r5r 1.0, and I think that those changes can make it possible to implement this feature now. I'll get back to you once I have some news.

anastassiavybornova commented 11 months ago

Hi all! I stumbled upon this enhancement issue on my quest for a straightforward way of matching the routing output of r5r's detailed_itineraries() to OSM edge IDs. As far as I understood, detailed_itineraries() returns the edge geometries, but not the OSM IDs of the edges (which we would very much like to have). I wonder if that is the same thing that you mention having been struggling with @Robinlovelace and @temospena ? If so, would be curious to hear your way of tackling this! And apologies if this is off-topic, I'm still new to r5r and figuring things out.

Robinlovelace commented 11 months ago

Yes I think this is the same issue that you're hitting @anastassiavybornova! Another option is to do post-routing join with OSM. Having the OSM IDs directly from the routing engine would help massively.

rafapereirabr commented 11 months ago

I believe @luyuliu has already sorted out a way to extract OSM ids. So perhaps it would be possible to include a column with these its in the output of detailed_itineraries()

luyuliu commented 11 months ago

Hi all,

I created a fork here: https://github.com/luyuliu/r5r with the requested feature for my personal usage. Now detailed_itineraries() can output tab-based csv with a list of each segment's OSM id as well as the boarding and alighting stops' index in the GTFS stops.txt file for each transit leg. I also change the code so that every link in the network is included in the street_network__to_sf() function; notice that current version will not include the new links that connecting origin/destination and the network.

You can recompile the java project and change the jar to the original jar in the r5r folder. I will make a release later.

Thanks, Luyu

rafapereirabr commented 11 months ago

Hi @luyuliu . this is great. Could you please paste a snaphot of the output so we can see how it tlooks like?

I'll have a meeting with the rest of the r5r team today and I'll discuss with them what changes you have implemented that we could merge into r5r. Cc @mvpsaraiva @dhersz

luyuliu commented 11 months ago

Sorry for the later response! Please find a csv below that shows the specifics of an OD pair. from_87_532.csv

Note that I changed the csv format from comma delimited to tab delimited. Hope this helps!

luyuliu commented 11 months ago

Oops, that csv only contains one trip leg. Please find another one below. from_203_157.csv

rafapereirabr commented 11 months ago

Thanks @luyuliu ! From what I see in the file from_203_157.csv, you have added three new columns to the output:

This looks good to me. My only concern is that adding these columns, in particular edge_id_list, would make the function much slower due to the size of the output in memory. Hard to antecipate the impact on performance because it would vary a lot on a case by case basis. This is not a huge problem per se, but perhaps these columns should not be returned by default. One option would be to have a new boolean parameter, something like osm_link_ids. It would default to FALSE, but users could set it to TRUE if they wanted the output to include the "edge_id_list" column.

Glad to hear more opinions on this from the other developers and users.

luyuliu commented 11 months ago

Yes that sounds good to me. This is pretty easy to implement; just add another parameter in the function and a condition clause.

Meanwhile, I would like to remind you again that both the board_stop_index and alight_stop_index are, literally, the index of board/alight stop's index in the list of stop in GTFS stops.txt starting from 0. This may confuse people if not explicitly indicated in the documentation. Meanwhile, I think it would be much better if anyone knows a way to return the actual stop_id. However, I don't know how to call the list of stops in the function due to the encapsulation rules of java. Maybe need to write another sets of getter in the code but I just didn't bother to do that.

luyuliu commented 11 months ago

Some updates on the board_stop_index and alight_stop_index. I did some tests on the lastest batch again and found some inconsistencies in the results. I assumed the two indexes are the integer index of the engine's internal stop table, which is returned in TripLeg.java as boardStop and alightStop. I thought it would follow the same sequences as the input stops.txt file, and I can find the actual stop_id according to the index, which I did in the prior test when using bus stops as origin and destination.

However, I found that the assumption won't hold if changing the O/D to different points other than the bus stops. And now I cannot interpret the returned indexes. My guess is that the new added O/D points are somehow disturbing the sequence of the stops; however, again, it's a black box to me about what's happening in the r5 engine honestly.

I noticed a hash map in src\org\ipea\r5r\Network\TransitLayerWithShapes.java indexForUnscopedStopId, which is the structure that stored the relationship between the internal integer-based ID and actual stop_id in GTFS (which is also the case in src\main\java\com\conveyal\r5\transit\TransitLayer.java). However, as always, I cannot easily find a way to access the structure in TripLeg.java. I may even need to rewrite some of the r5 codes if I need to access this info. I am wondering if there is a bypass or trick that find the original GTFS stop_id (a list or a hashmap) in TripLeg.java?

cc @mvpsaraiva. Thanks in advance.

luyuliu commented 11 months ago

Yet another update on the board_stop_index: I took a while to read the code and managed to work it out, so my prior questions are moot. I found the stopIdForIndex variable in conveyal\r5\transit\TransitLayer.java, so I can access the scoped GTFS stop information and assign it to the csv. See an example below.

from_110_187.csv

rafapereirabr commented 8 months ago

Hi @luyuliu , sorry for my slow response.

Do you think it would be possible to include in the detailed_itineraries() a new boolean parameter osm_link_ids like we discussed above? So that when osm_link_ids = TRUE, than the output would include these three columns:

luyuliu commented 8 months ago

That's a good idea but I may need to think about how to do this; I haven't touched the Java code since forever. Just spent a whole day catching up with the Java 21 version, but seems like the a lot of dependencies on the Java side have changed since last update. I eventually chose to stay with v 1.0.0 and Java 11 for now.

The primary issue I am facing now is that there is a class in Kyro Jar called com.esotericsoftware.kryo.util.DefaultInstantiatorStrategy that's missing when I use the old JRI.jar file. And I couldn't find the location of new JRI.jar file in the r5r installation folder when I tried the latest developer version, if not there is no such a file any more. The location of r5 jar also changed too. I don't want to make the pull request yet before I can successfully pass the test on my end in case I break anything; but I can make the new osm_link_ids field happen in Java 11 and pull request, and you can decide if that's okay to merge. Any suggestion would be useful.

Thanks, Luyu

rafapereirabr commented 8 months ago

Ah, I see. In this case, perhaps we should first make progress with the new r5r v2.0.0 using Java JDK 21 and the new functions to store / chache R5. Once we do that, we can come back to this issue. What do you think?

luyuliu commented 8 months ago

Good idea; thanks for the suggestion. I don't imagine it would have any impacts on the code I will be revising, but again, just the testing environment on my side. I will let you know once I have more to share.

luyuliu commented 1 month ago

Hi, just want to bring up this issue again for I have been working on migrating my old code from pre-2.0 era. I have been able to successfully replicate the exported csv files with tab delimitator and detailed boarding and alighting stop information and OSM id system. It's a surprisingly smooth process. I am using the r5-v7.1-all.jar for R5 dependencies.

An example from Jacksonville, Florida. from_0_to_99.csv

I am wondering if the edits are ready to be incorporated? Or we can wait.

rafapereirabr commented 1 month ago

Hi @luyuliu. This looks good to me. Could you please open a PR. Since Conveyal has released R5 v7.2, I would like to test it with R5 v7.2.

luyuliu commented 1 month ago

Just create a PR to the main branch. Let me know if this is okay.

I haven't been able to implement your suggestions earlier about using parameters to set the output fields in the csv file. It would involve the changes in both R and Java side, which is not hard to do after a quick look, but I do not know how to debug and test on my end. Let me know if you can help me on that part.

rafapereirabr commented 4 weeks ago

Thanks @luyuliu ! @mvpsaraiva is looking into a few priority issues on the Java side and we'll have a look at your PR right after that. Thanks for your patience!