ITSLeeds / UK2GTFS

Convert UK transport data (TransXchange / ATOC CIF) to GTFS format in R
https://itsleeds.github.io/UK2GTFS/
GNU General Public License v3.0

Ensure that ongoing services with an unknown EndDate are available in… #32

Closed stupidpupil closed 3 years ago

stupidpupil commented 3 years ago

… the immediate future.

Services may run for a long time unchanged. A current example of this is the Celtic Coaches X47 (Wales), with an OperatingPeriod StartDate of 2020-06-29 and no defined EndDate.

The proposed change should ensure that future services are available for a year after they start, and that current services are available for a year from the time of import.
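As a rough illustration of the rule described above, here is a minimal sketch in Python (not the package's R code). The function name `assumed_end_date`, its parameters, and the reading of "a year" as 365 days are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "a year" is taken to mean 365 days.
HORIZON = timedelta(days=365)


def assumed_end_date(start: date, end: Optional[date], import_date: date) -> date:
    """Hypothetical helper: choose an end date for a service.

    - If the file gives an EndDate, use it unchanged.
    - If the service starts after the import date, keep it for a year from its start.
    - Otherwise (already running), keep it for a year from the import date.
    """
    if end is not None:
        return end
    if start > import_date:
        return start + HORIZON
    return import_date + HORIZON


# A service that started on 2020-06-29 with no EndDate, imported on an
# illustrative date of 2021-07-20, stays available until 2022-07-20.
x47_end = assumed_end_date(date(2020, 6, 29), None, date(2021, 7, 20))
```

Under this rule the result depends on when the import is run, which is what distinguishes it from a rule based only on dates stored in the file.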

mem48 commented 3 years ago

This would cause problems with a historical analysis of timetable data. If someone converts a 2010 timetable, assuming that the service was still running in 2011 may be reasonable. Assuming the same service still runs in 2022 is not.

stupidpupil commented 3 years ago

Can't argue with that. Perhaps some sort of user-configurable option is called for?

mem48 commented 3 years ago

A configurable option would be better, but having too many options can make the code unmanageable. I'm not clear on the use case where you would be planning routes on a GTFS file that had been created more than a year ago. Surely most users will use a recent GTFS file so that they have any recent timetable changes included. Thus the one-year cutoff is sufficient.

stupidpupil commented 3 years ago

Does this reasoning also apply to historical analysis? If you're converting timetables from 2010, are you likely to be planning journeys etc. as if you're in 2010 or 2020? (I'm completely ignorant here.)

I meant an option to switch between "a year from OperatingPeriod > StartDate" vs "at most a year from now (for current services)", but I see your point about complexity.

mem48 commented 3 years ago

Sorry my last comment wasn't very clear.

I'm trying to understand the use case where conversion date + 365 is better than start date + 365.

As I've said already, it causes problems with historic data. But even if you were trying to produce an OTP graph for today, wouldn't you start by downloading the latest transXchange files? So the difference between start date and conversion date would only be a few weeks? Or are they publishing transXchange files with start date in the distant past?

stupidpupil commented 3 years ago

I wasn't very clear in my opening comment, I think.

> Or are they publishing transXchange files with start date in the distant past?

Yes. Well, more than a year ago.

The current Wales TNDS file includes this TxC file for the Celtic X47 with

      <OperatingPeriod>
        <StartDate>2020-06-29</StartDate>
      </OperatingPeriod>

(Downloaded 2 minutes ago, W.zip updated around 7-8pm last night according to the TNDS FTP server, CreationDateTime on the 7th July.)

No sign that it's not still running.
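For anyone wanting to check for this case, a minimal Python sketch of reading such a fragment follows. It parses only the bare fragment quoted above; it assumes full TransXChange files declare an XML namespace, which would need namespace-qualified lookups instead. The function name `read_operating_period` is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# The OperatingPeriod fragment quoted above, without the rest of the file.
FRAGMENT = """
<OperatingPeriod>
  <StartDate>2020-06-29</StartDate>
</OperatingPeriod>
"""


def read_operating_period(xml_text):
    """Hypothetical helper: return (start, end); end is None if EndDate is absent."""
    period = ET.fromstring(xml_text.strip())
    start_text = period.findtext("StartDate")  # None when the element is missing
    end_text = period.findtext("EndDate")      # None when the element is missing
    start = date.fromisoformat(start_text) if start_text else None
    end = date.fromisoformat(end_text) if end_text else None
    return start, end


start, end = read_operating_period(FRAGMENT)
# start is the parsed StartDate; end is None because no EndDate is present.
```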

mem48 commented 3 years ago

I see, so this is a serious bug as services can go missing even if you downloaded the files today.

Possible solution:

I think the TransXchange files are all timestamped with CreationDateTime, so if you assumed that a service was running for CreationDateTime + 365 that would work for current and historical data.

stupidpupil commented 3 years ago

Looks like both CreationDateTime and ModificationDateTime have supposedly been Required since v1.2.

mem48 commented 3 years ago

In that case, do you think you could implement the fix? I'd be happy to accept the pull request.

Would probably be best to do the max of start date + 365 and CreationDateTime + 365 just in case it covers a timetable in the future. Unless the ModificationDateTime is significantly different from CreationDateTime?
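The "max of the two" rule could be sketched as follows, in Python rather than the package's R and not as the merged implementation. The function name `fallback_end_date`, the `days` parameter, and the choice to tolerate either value being missing are assumptions for illustration, as is the year used for the CreationDateTime in the example.

```python
from datetime import date, datetime, timedelta
from typing import Optional


def fallback_end_date(
    start: Optional[date],
    created: Optional[datetime],
    days: int = 365,
) -> Optional[date]:
    """Hypothetical helper: end date for a service with no EndDate.

    Returns max(start + days, created + days), computed as
    max(start, created) + days. Either input may be None; if both
    are None there is nothing to base an end date on.
    """
    candidates = []
    if start is not None:
        candidates.append(start)
    if created is not None:
        candidates.append(created.date())
    if not candidates:
        return None
    return max(candidates) + timedelta(days=days)


# StartDate from the X47 file, with an illustrative CreationDateTime
# of 7 July 2021 (the year is an assumption).
x47_end = fallback_end_date(date(2020, 6, 29), datetime(2021, 7, 7, 12, 0))
```

Because both inputs come from the file itself, the result does not depend on the conversion date, which keeps historical conversions reproducible.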

I've found that while TransXchange is supposed to be a national standard there are local dialects. So UK2GTFS has baked in assumptions that worked for the locations I was testing at the time. Wales has never been the focus of my work so it seems you have uncovered several flawed assumptions. Keep them coming!

There is potentially some Wales-focused work in the pipeline to bring www.carbon.place to Wales. The 15-minute isochrones and the Transit Stops layers are all based on UK2GTFS and OTP, so I'm definitely interested in bug fixes or other suggestions to improve the code.

stupidpupil commented 3 years ago

Had a go at it. Hopefully robust to one of them missing but possibly a bit messier than you'd like.

I'm hoping that automated testing whenever I regenerate the network graph will identify issues from various sources. It's looking like I might be in a position to do automatic checks against Traveline's journey planner as part of that. It's those comparisons that have most helped discover these odd cases so far!

At the moment, using my fork with very minor changes that make crude fixes to various issues, I think the only real discrepancies I'm currently finding are to do with the Traveline planner lagging behind TNDS. Any suggestions for the sort of trips that might turn up other issues (or regressions?) would be very welcome.

www.carbon.place looks interesting! (Will raise with a Public Health Registrar who I know has just started working on air pollution with WG.)

mem48 commented 3 years ago

Hi @stupidpupil, this looks good, so I'm going to merge it now.