Closed Arilith closed 8 months ago
Hi and welcome to OTP.
I feel that this is definitely a data error and should be fixed in the GTFS-RT feed. Why does the feed contain stops that the train does not stop at? SKIPPED means that there was a plan to stop at this stop but for some reason that won't happen anymore. How is OTP supposed to tell the difference between a genuinely skipped stop and your example?
Whilst the preferred option is to fix the RT input, dealing with messy feeds is just a fact of life that we as OTP developers often have to deal with.
We decided a while ago that we are OK with fixing the data in OTP itself as long as we make it explicit and visible rather than automagic, perhaps through some kind of opt-in clean-up pipeline.
If you're willing to help out with this, we would love you to join our dev meetings twice a week or the Gitter chat.
In any case, even if you can't work on it, you can join, as it would be good to have someone from the Dutch OV project to talk to.
@leonardehrenfried this is not a data error. The intermediate stops are available in the realtime data but are not part of the regular GTFS feed, as they are not passenger data. The case here is an intercity train that passes stops without calling at them. This has been the design since 2013, as part of the original OTP realtime implementation.
From my point of view OTP should not judge the data other than guarding that time is increasing, and should just apply the data for the stop times it knows about. The original implementation in OTP1 actually replaced the entire journey pattern with the new sequence, which gives the same effect.
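To make the "guard that time is increasing" check concrete, here is a minimal sketch. It is not OTP code: the class and method names are hypothetical, and it assumes the arrival/departure times of a trip update have already been flattened into one list of epoch seconds in travel order.

```java
import java.util.List;

// Hypothetical helper, not part of OTP: checks that realtime times never go backwards.
public class TimeGuard {

    // Returns true when each time is greater than or equal to the one before it.
    public static boolean isNonDecreasing(List<Long> epochSeconds) {
        for (int i = 1; i < epochSeconds.size(); i++) {
            if (epochSeconds.get(i) < epochSeconds.get(i - 1)) {
                return false; // "time travel": this time is earlier than the previous one
            }
        }
        return true;
    }
}
```

An update that fails this check could be rejected, while anything else would be applied as-is.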
Did it do that for trips of schedule relationship SCHEDULED?
The code that produces this error was actually written by Jordan Verwer back in the day. We tried to preserve the original behaviour when refactoring all of this but maybe this case didn't have a test. I don't think it was changed on purpose. One thing to consider: the statistics about whether the update was successful are a recent addition. Is it possible that this never worked and you just didn't notice?
In any case, we are very happy to talk about all of this and would love to have OV contribute to OTP2.
Did it do that for trips of schedule relationship SCHEDULED?
Yes. At the time we basically defined the entire OTP Realtime interface to allow deviations to work natively.
The code that produces this error was actually written by Jordan Verwer back in the day. We tried to preserve the original behaviour when refactoring all of this but maybe this case didn't have a test. I don't think it was changed on purpose. One thing to consider: the statistics about whether the update was successful are a recent addition. Is it possible that this never worked and you just didn't notice?
Let's say I don't know if it ever worked with OTP2. It certainly did with OTP1, as having trip deviations was part of the funding and a test set was created: https://github.com/plannerstack/testset You will recognise the person that made the last commit :-)
In any case, we are very happy to talk about all of this and would love to have OV contribute to OTP2.
I'll add @sven4all in this discussion too.
To add some rationale for why the times are added in the first place: some applications draw vehicle positions on a map by interpolating between stations (since not all trains in the Netherlands have locations). Having the intermediate times available allows linear interpolation at a better granularity. In addition you can do cool things with network analysis. So yes, it is a use case outside of the realm of "travel information", but it is certainly not "inconsistent" with the generic GTFS feed, if only these stop_times would be updated.
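As an illustration of the interpolation use case, a rough sketch follows. The class, method and parameter names are made up for this example; it assumes plain linear interpolation in latitude/longitude between two consecutive timing points, which is only a reasonable approximation over short distances.

```java
// Hypothetical example, not taken from any existing application.
public class PositionInterpolator {

    // Estimates a vehicle position at time "now" between timing points A and B.
    // Times are epoch seconds; the result is {latitude, longitude}.
    public static double[] interpolate(
        double latA, double lonA, long timeA,
        double latB, double lonB, long timeB,
        long now
    ) {
        if (timeB <= timeA) {
            return new double[] { latA, lonA }; // degenerate segment, stay at A
        }
        double fraction = (double) (now - timeA) / (double) (timeB - timeA);
        fraction = Math.max(0.0, Math.min(1.0, fraction)); // clamp to the segment
        return new double[] {
            latA + fraction * (latB - latA),
            lonA + fraction * (lonB - lonA)
        };
    }
}
```

The more passing points with times there are between two stations, the shorter each segment becomes and the closer the estimate follows the actual track.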
Hi Leonard,
Thanks for the openness and quick response! I would definitely like to join one of those meetings. However, as you have probably noticed, @skinkie is much more well-versed in the Dutch public transport data world, so I'm not sure if I could add anything that he hasn't mentioned yet.
Anyhow, I'll see if I can join some day so that you could ask some more questions!
This is pure speculation, but you could try the following: in https://github.com/opentripplanner/OpenTripPlanner/pull/4424 I introduced these typed errors to replace the flood of logs.
You could try and find out if it was this PR that broke your use case. This might give you an indication how to revert it.
@leonardehrenfried I just tried reverting to 2.1.0 (which, according to GitHub, doesn't have the commit you mentioned), and although it doesn't neatly specify the "success percentage", I still get flooded with quite a large number of errors.
It's difficult to say if it's less / more, but seeing that the OTP Webapp doesn't specify "too late" or "on time" for any trains, I feel like it did not work as expected. So I guess there's something more behind it next to that commit.
After some further research, it seems like (some) of the errors come from the following situation:
Say I've got Stop A, which has track 1 (subdivided into 1a and 1b). In my scheduled data (stop_times) this trip is designated to start at track 1. However, the GTFS-RT stream gives the more specific track that is now known, track 1a. This results in OTP immediately not being able to find the first stop, as it now has a different ID than the originally planned stop. This makes OTP throw a STOP_SEQUENCE_ERROR.
The problem is that the match at https://github.com/opentripplanner/OpenTripPlanner/blob/edadd0954fd73fa71cecdf9222f4d8ddfdf7e5c2/src/main/java/org/opentripplanner/model/Timetable.java#L227 is not found. This is because the matching only looks at the stop sequence or the stop_id. Without a stop sequence, in the situation described above, the stop_id matching of course fails.
However, this "updating" of a new more specific stop as first stop is within the GTFS-RT spec. See: https://support.google.com/transitpartners/answer/10106587?hl=en#zippy=%2Cupdate-the-platform
As long as the new stop (platform) is under the same "parentstation", it is still valid. It seems like OTP does not handle this case. This is most likely not the only issue that creates the specified error, but it is definitely one of them.
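To illustrate what a more lenient match could look like, here is a small sketch. It is not how OTP currently matches stops: the class name, the method and the stop-to-parent lookup map are assumptions made for this example (the map would be built from parent_station in stops.txt).

```java
import java.util.Map;

// Hypothetical matcher: treats two platforms of the same station as a match.
public class PlatformMatcher {

    // parentOf maps a stop id to its parent station id.
    public static boolean sameStopOrStation(
        String scheduledStopId,
        String realtimeStopId,
        Map<String, String> parentOf
    ) {
        if (scheduledStopId.equals(realtimeStopId)) {
            return true; // exact match, the existing behaviour
        }
        String scheduledParent = parentOf.get(scheduledStopId);
        String realtimeParent = parentOf.get(realtimeStopId);
        // Accept a platform change only when both stops have the same known parent station.
        return scheduledParent != null && scheduledParent.equals(realtimeParent);
    }
}
```

With a lookup in which "A:1" and "A:1a" both map to station "A", an update for "A:1a" would match the scheduled stop "A:1", while a platform at a different station would still be rejected.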
EDIT: What about a serious disruption? Some stops may no longer be passed and new stops may be added elsewhere, so that neither the stop_sequence nor the stop_ids line up anymore. Isn't it better/easier to just overwrite the whole journey pattern with the updated data instead of matching each stop one by one?
I think the way how tripUpdates are applied should be configurable per feed. Andrew and Jorden agreed before we need to establish some kind of quality statement per feed like "this feed is complete and allows trips to be replaced" versus "this feed only makes next stops delays available which should then be propagated for all stops to make them coherent again without time travel".
After doing some more digging, it seems like the "REPLACEMENT" ScheduleRelationship case comes very close, as it just cancels the old trip and dumps all new stops into the graph. However, it seems like "REPLACEMENT" is deprecated in the GTFS spec?
The deprecation is a political move that forces a (complex) specification change. https://github.com/google/transit/issues/113
Even though it's deprecated, the REPLACEMENT code is still in OTP from the original Dutch implementation, presumably because the spec change that @skinkie refers to never materialized. Is this correct?
It's worth checking whether it still works.
Hi Leonard,
I tried customizing our GTFS-RT feed to make all trips "MODIFIED" instead of "SCHEDULED", however, although the import errors were much lower, it seemed like the OTP graph was actually removing most of the trips. As soon as OTP processed the pb file, the routing engine wasn't able to find any train trips anymore. Seems like something went wrong there, but can't put my finger on what.
Anyhow, for my personal implementation, I have now just created my own custom Dutch GTFS-RT feed for trains which works almost perfectly with 95% import success. I noticed the other 5 percent is all international trains with a different stop sequence, so that is still something that would be worth figuring out. Anyhow, I can confirm that the problem with the official Dutch GTFS-RT feed is both the lack of sequence numbers and the many "unofficial" (skipped) stops. I've removed these from my own feed and as stated before, with 95% import success. I guess we have some digging to do on our side as well to make the official feed better!
Good to hear that you managed to get the errors down to a low number.
BTW, you're welcome to ask even very detailed questions in our Gitter room: https://gitter.im/opentripplanner/OpenTripPlanner
@koch-t ^^^
For the remaining errors, could it be due to stops that aren't part of the original feed (with the same id) and not in the graph as a result? That would prevent the update from being applied.
The biggest problem (for now) seems to be the following (this applies mostly to our international trains): in the GTFS files, the stop sequences for international trains go like "1, 3, 10, 17, 23, 36". This would mean the train passes about 36 stops during its journey. However, the stops that are put into our national "InfoPlus" system by Dutch Railways only cover about 20 stops (it doesn't specify international "skipped" / "passed" stops). This means the stop sequence generated by my GTFS-RT constructor and the stop sequence in the planned GTFS don't line up anymore. This makes the import fail immediately, as we have the skipped stops in the feed, which OTP doesn't like, and the sequences don't line up anymore. In this case, it would be better to just overwrite the whole journey pattern instead of matching each stop one by one.
For a high quality feed. Is there ever a reason not to overwrite everything?
@Arilith BTW, you don't need the sequence number. You can also just use the stop id. (Of course if you have a circular route this will not work, but these are pretty rare, particularly for trains.)
For a high quality feed. Is there ever a reason not to overwrite everything?
If your input data is good, I don't see a reason why you should not.
Is it an idea to have a configuration option to set the default behavior?
I think it is.
We decided a while ago that OTP is now a sort of enterprise software and we don't shy away from having lots of configuration options. We also have automatic generation of the documentation for these config options. We prefer that over everybody having their own fork and implementing their custom logic downstream.
You could also test/fix the REPLACEMENT schedule relationship again which does what you want.
Also, the way to define platform changes, such as from 1 to 1a, was recently adopted in the official GTFS-RT protobuf schema in https://github.com/google/transit/pull/219. It is not currently implemented in OTP, but should be quite straightforward for you to implement.
@Arilith BTW, you don't need the sequence number. You can also just use the stop id. (Of course if you have a circular route this will not work, but these are pretty rare, particularly for trains.)
For a high quality feed. Is there ever a reason not to overwrite everything?
If your input data is good, I don't see a reason why you should not.
I added the sequence number because the current way of dealing with platform changes does not work with only stop ids. For example, when we have a train that is planned from Amsterdam Centraal platform 4 and is later updated to depart from platform 4b after more information has become available, that trip would instantly fail to be inserted, as the stop ids don't match anymore (I debugged OTP step by step and found this out, not sure if that's correct behaviour). The same problem applies to stops further along the route that have changed platforms.
This creates a massive issue, as platform changes are quite common. I've only re-created our trainUpdates.pb until now, and I feel like the 10% of errors in the tripUpdates.pb from other vehicles is mostly due to the same problem described above.
I could try removing the stop sequence for international trains, but it also sometimes happens (very rarely) with local trains, that the planned sequence doesn't line up with the actual sequence. (Think of a train that was planned to go through station X, but there was a switch issue, which made it go through station Y and Z which were not even in the original planning).
Another problem @skinkie and I actively discussed is how to deal with "TVV" (Train Replacing Transport). Currently, if I have a trip that is not in the pre-defined GTFS definition, but my realtime data has it, it can not be inserted into the OTP graph as the tripId won't be found. (And/or serviceId) (As even the handleAddedTrip method checks for these properties)
This is quite a big problem for scenarios where a big disruption happens and non-planned buses/replacement transport are activated. What is your view on this? Is there a "by the book" solution for this?
About changing everything to "Modified", this would (I think) only solve the stop sequence errors for the International train problem I described earlier (though, as described, there's no clear "I'm wrong" sign in our dataset). But for all new trips, this also won't work due to the problems with missing trip information.
It would be nice to have a way of adding a whole new trip (with headsign, serviceId, etc) with a linked "originalTripId" so that it could still be referenced to the original (now cancelled trip).
Now I'm very new to GTFS-RT, so maybe this is already in the works, but this is what I experienced over the last week. IMO, GTFS(RT) is quite limited, especially in comparison to the (realtime)data we work with nationally.
For adding a completely new route you will be interested in this PR: https://github.com/opentripplanner/OpenTripPlanner/pull/4667
I'm using it for dynamically adding carpool "routes" but technically that is the same as an emergency rail replacement service.
It adds the ability through a protobuf extension but if it's more than a single organization using it, it would be worth getting it into the official spec. Also @skinkie has lots of experience doing just that.
Oh that PR looks good! Thanks for the heads up. As soon as it is accepted, I'll definitely build that version and try it out.
Also the way to define platform changes, such as from 1 to 1a, was recently adopted in the official GTFS-RT protobuf schema in google/transit#219. It is not currently implemented in OTP, but should be quite straight forward for you to implement.
Experimental feature. Would not be required if data is 'just' replaced.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days
@Arilith did you have a patch?
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days
Keep open.
@Arilith did you have a patch?
I have not patched OTP directly (yet). I have been looking into #5230, to fix the current issue of there not being an "official" way to have platform changes under the same station using the SCHEDULED relationship, even though according to the spec this should be possible. However, this seems to be a little more complex than what I have had the time for to put into developing, so that'll take some time.
Currently I'm using a modified GTFS-RT stream that makes use of the (sadly) deprecated MODIFIED schedulerelationship. For "ritbeeldmatchende (replacement / extended)" trains I simply use the CANCEL / ADDED combination. This generally results in a fairly stable way to make OTP consume our Dutch InfoPlus system, with about 99-100% import success, except for very specific cases.
I can always try to take a look where exactly our official (openOV/OVAPI) GTFS-RT streams go wrong and try to patch that out, but it seems like that would require a substantial change in the way OTP deals with the SCHEDULED trips, as that currently does not allow any kind of stop modification (as far as I've seen). I'm having a discussion with Leonard today, so if I get any more information about this, I'll update it here.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days
Keep open.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days
Closure is stupid.
With that kind of language you're not helping your cause.
Resources are limited and without developers being funded to look into the code, there is no point in pretending that it will be magically fixed.
@leonardehrenfried there is no intelligence in github-actions, it cannot be offended. Let's not forget this used to work in OTP1. Closure does not make issues disappear. That is a really stupid management paradigm.
The good news is that you can keep using this non-standard feature in OTP1 forever.
Expected behavior
The specified trip updates are inserted correctly. Maybe give a warning about not-found stops.
Observed behavior
An INVALID_STOP_SEQUENCE error is thrown for many trips, which causes OTP to fully ignore the message, even though there is valid data.
Version of OTP used (exact commit hash or JAR name)
otp-2.2.0-shaded.jar
Data sets in use (links to GTFS and OSM PBF files)
http://gtfs.ovapi.nl/nl/gtfs-nl.zip
Stripped version of netherlands-latest.osm.pbf https://download.geofabrik.de/europe/netherlands-latest.osm.pbf
Command line used to start OTP
java -Xmx8G -jar otp-2.2.0-shaded.jar --load .
Router config and graph build config JSON
Router config:
Steps to reproduce the problem
Import the specified GTFS and use the given router config. Errors like these will be thrown:
I have already contacted our public transport authority and spoken with a contact. The "problem" is that our train trips overspecify updates (even for stops which are not in the "stop-times" files).
Let's say we have a train from point A to point E. Along the way the train passes points B, C and D. The train stops at point C, but not at points B and D. Our GTFS-RT feed will still specify points B and D as "skipped" or "ignore" them. This is probably the cause of the above errors.
Now in the GTFS-RT specification it is specified the stops should be in order, however, the order is correct, there's just "Too many". It would be nice if OTP could ignore these errors and still update the graph as now over 50% of the updates are ignored.
Example trip update with error:
Looking up these stops in stop_times.txt gives:
So the specified stop of "2423251" in the GTFS-RT is not in the stop_times file as the train won't actually stop there, but simply passes it.
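As a rough sketch of the behaviour I'm asking for (not actual OTP code; the class and method names are made up, and the stop ids other than 2423251 are placeholders), the update could be reduced to the stops that the scheduled trip knows about before it is matched:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical pre-processing step: drop realtime stops that are not in stop_times.
public class PassingStopFilter {

    // Keeps only the realtime stop ids that occur in the scheduled trip, preserving order.
    public static List<String> keepScheduledStops(
        List<String> realtimeStopIds,
        Set<String> scheduledStopIds
    ) {
        List<String> kept = new ArrayList<>();
        for (String stopId : realtimeStopIds) {
            if (scheduledStopIds.contains(stopId)) {
                kept.add(stopId);
            }
            // Stops such as 2423251 that the train only passes are silently dropped here;
            // a real implementation might log a warning instead.
        }
        return kept;
    }
}
```

The remaining stops are still in order, so they could then be matched against the scheduled pattern as usual.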
Kind regards, Tristan
See attached files for some more data.
OTP Error.txt TrainUpdatesErrors.csv TrainUpdates.txt