ue71603 opened 2 months ago
I think the Swiss NeTEx export is based on Calls, so the GTFS one might work.
Which sequence of .py files should I use? Then I will test it.
any-to-epip.py: you read a .gz. Is this a single-file XML? The Swiss file is separated by frame. Does this need preprocessing, and do you already have a .py that does that?
This would be some building blocks for the pipeline.
Can it be that any-to-epip is very minimalistic? It doesn't seem to copy the ResourceFrame. Also, I didn't see anywhere that it takes the Call and does something with it.
any-to-epip.py: you read a .gz. Is this a single-file XML? The Swiss file is separated by frame. Does this need preprocessing, and do you already have a .py that does that?
Because it reads an XPath expression, it does not consider frames, just the instances of the objects. This is a potential issue if the frame defaults differ between frames, in which case the frame defaults might have to be applied first. But I have not yet found a complex case that mixes frame defaults.
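To illustrate the frame-defaults concern, here is a minimal sketch using only the standard library. It is not what any-to-epip.py does; the sample document, the `Line` objects and the helper name `lines_with_codespace` are made up for illustration, and nested frames (a CompositeFrame wrapping other frames) are not handled. A flat XPath query returns the objects without any frame context, while walking frame by frame lets each object pick up the default of its enclosing frame.

```python
import xml.etree.ElementTree as ET

NS = {"n": "http://www.netex.org.uk/netex"}

# Hypothetical sample: two frames with different default codespaces.
SAMPLE = """<PublicationDelivery xmlns="http://www.netex.org.uk/netex">
  <dataObjects>
    <ServiceFrame id="sf1" version="1">
      <FrameDefaults><DefaultCodespaceRef ref="aaa"/></FrameDefaults>
      <lines><Line id="L1" version="1"/></lines>
    </ServiceFrame>
    <ServiceFrame id="sf2" version="1">
      <FrameDefaults><DefaultCodespaceRef ref="bbb"/></FrameDefaults>
      <lines><Line id="L2" version="1"/></lines>
    </ServiceFrame>
  </dataObjects>
</PublicationDelivery>"""


def lines_with_codespace(root):
    """Map each Line id to the default codespace of its enclosing frame."""
    result = {}
    # Only direct children of dataObjects are treated as frames here.
    for frame in root.iterfind(".//n:dataObjects/*", NS):
        default = frame.find("n:FrameDefaults/n:DefaultCodespaceRef", NS)
        codespace = default.get("ref") if default is not None else None
        for line in frame.iterfind(".//n:Line", NS):
            result[line.get("id")] = codespace
    return result


root = ET.fromstring(SAMPLE)

# Flat query: the objects are found, but the frame context is lost.
flat_ids = [e.get("id") for e in root.iterfind(".//n:Line", NS)]

# Frame-aware walk: each object keeps the default of its own frame.
by_frame = lines_with_codespace(root)
```

If all frames in a delivery share the same defaults, the flat query is enough; the frame-aware walk only matters once defaults are mixed.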
Can it be that any-to-epip is very minimalistic? It doesn't seem to copy the ResourceFrame. Also, I didn't see anywhere that it takes the Call and does something with it.
Magic happens here:
timetabledpassingtimesprofile = TimetablePassingTimesProfile(codespace, version, service_journeys, service_journey_patterns)
timetabledpassingtimesprofile.getTimetabledPassingTimes(clean=True)
You likely now understand what this would look like in a node-based conversion :-)
And how do we deal with the frame-based files: https://opentransportdata.swiss/de/dataset/timetablenetex_2024/resource/7faba8f2-797f-41f8-81e6-d8b3f55e9e85
Collate some manually for testing?
Write a collator?
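A collator for testing could be as small as the sketch below. This is an assumption about how one might do it, not an existing script in the repository: the function name `collate` and the helper `delivery` are hypothetical, and everything is merged in memory, so it only suits small hand-picked samples. It copies the first delivery and appends the frames found under `dataObjects` in every other delivery.

```python
import copy
import xml.etree.ElementTree as ET

NETEX = "http://www.netex.org.uk/netex"
DATA_OBJECTS = f"{{{NETEX}}}dataObjects"

# Keep the NeTEx namespace as the default namespace when serialising.
ET.register_namespace("", NETEX)


def collate(roots):
    """Return a copy of the first delivery with the frames of all others appended."""
    merged = copy.deepcopy(roots[0])
    target = merged.find(DATA_OBJECTS)
    if target is None:
        raise ValueError("first delivery has no dataObjects element")
    for other in roots[1:]:
        source = other.find(DATA_OBJECTS)
        if source is not None:
            # Deep copies keep the input trees untouched.
            target.extend([copy.deepcopy(child) for child in source])
    return merged


def delivery(kind, frame_id):
    """Build a tiny single-frame delivery (illustrative data only)."""
    return ET.fromstring(
        f'<PublicationDelivery xmlns="{NETEX}"><dataObjects>'
        f'<{kind} id="{frame_id}" version="1"/>'
        f"</dataObjects></PublicationDelivery>"
    )


roots = [delivery("ResourceFrame", "r"), delivery("TimetableFrame", "t")]
merged = collate(roots)
frame_ids = [child.get("id") for child in merged.find(DATA_OBJECTS)]
```

The result can be written out with `ET.ElementTree(merged).write(path)` and fed to a converter that expects a single file.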
You mean Line based deliveries + shared file like Norway?
We have COMMON, RESOURCE, SERVICE, SERVICECALENDAR and SITE files, and then line-based TIMETABLE files.
It may be that the TIMETABLE files cover multiple lines ("Betriebszweig").
Yes, in that case you would need to heuristically establish which files you want to load to fetch the concept. But as long as you can lazy-load them, it is not an issue.
One of the other things I would like to implement is actual lazy loading. So that concepts can be resolved 'just in time'.
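A rough sketch of what "just in time" resolution could look like, assuming each shared file can be parsed by a zero-argument callable. The class name `LazyPool` and the sample sources are hypothetical and not part of the current code. Sources are tried in insertion order; a smarter heuristic for picking the right file per concept would replace that loop. Versions are ignored for brevity.

```python
import xml.etree.ElementTree as ET


class LazyPool:
    """Resolve ids on demand, parsing each source only when it is first needed."""

    def __init__(self, loaders):
        # loaders: mapping of source name -> callable returning a parsed XML root
        self._loaders = dict(loaders)
        self._index = {}   # id -> element, filled as sources get parsed
        self.loaded = []   # names of the sources parsed so far, in order

    def resolve(self, ref):
        if ref in self._index:
            return self._index[ref]
        # Parse the remaining sources one by one until the id shows up.
        for name in list(self._loaders):
            root = self._loaders.pop(name)()
            self.loaded.append(name)
            for element in root.iter():
                element_id = element.get("id")
                if element_id is not None:
                    self._index.setdefault(element_id, element)
            if ref in self._index:
                return self._index[ref]
        raise KeyError(ref)


# Illustrative in-memory sources; real loaders would parse the shared files.
pool = LazyPool({
    "SITE": lambda: ET.fromstring('<f><StopPlace id="sp1"/></f>'),
    "SERVICE": lambda: ET.fromstring('<f><Line id="l1"/></f>'),
})

stop = pool.resolve("sp1")
loaded_after_first = list(pool.loaded)   # only SITE has been parsed so far
line = pool.resolve("l1")                # this triggers parsing of SERVICE
```

With this shape, a timetable file that never references a stop place would never cause the SITE file to be parsed.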
Let's assume our case: the pool to be loaded would be the COMMON, RESOURCE, SERVICE, SERVICECALENDAR and SITE files, and then each TIMETABLE file can be processed in turn. So how do we pass the extra parameters:
currently any-to-epip works file by file.
Correct, only the Nordic export (not validated) splits files. The big question is: should the separate files first be loaded into, for example, a database (this would allow resolving inter-file relationships), or do we blissfully ignore those and go directly for a line-based to line-based conversion?
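For the database option, a minimal sketch with the standard-library `sqlite3` module is shown here. The table layout and the function names `load_into_db` and `resolve` are assumptions for illustration, not an existing design. Every element that carries an `id` is stored as serialised XML, keyed by id and version, so a reference found in one file can be resolved against an object that came from another file.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_into_db(conn, source, root):
    """Store every element that has an id, remembering which file it came from."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        "id TEXT, version TEXT, source TEXT, tag TEXT, xml TEXT, "
        "PRIMARY KEY (id, version))"
    )
    rows = [
        (
            element.get("id"),
            element.get("version", ""),
            source,
            element.tag,
            ET.tostring(element, encoding="unicode"),
        )
        for element in root.iter()
        if element.get("id") is not None
    ]
    conn.executemany("INSERT OR REPLACE INTO objects VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def resolve(conn, ref):
    """Return the stored element for an id, or None if it is unknown."""
    row = conn.execute(
        "SELECT xml FROM objects WHERE id = ? ORDER BY version DESC LIMIT 1",
        (ref,),
    ).fetchone()
    return ET.fromstring(row[0]) if row else None


# Illustrative data: a Line from a shared file, a journey from a timetable file.
conn = sqlite3.connect(":memory:")
load_into_db(conn, "SERVICE", ET.fromstring(
    '<f><Line id="l1" version="1"><Name>S1</Name></Line></f>'))
load_into_db(conn, "TIMETABLE", ET.fromstring(
    '<f><ServiceJourney id="sj1" version="1"><LineRef ref="l1"/></ServiceJourney></f>'))

journey = resolve(conn, "sj1")
line = resolve(conn, journey.find("LineRef").get("ref"))
line_name = line.findtext("Name")
```

Using a file path instead of `":memory:"` would keep the shared objects on disk, at the cost of one serialise and parse round trip per lookup.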
Loading them beforehand would need 2-3 GB of memory. As for a database: you are the architect.
with which subelement?