skinkie / reference

Personal repository where I collect working examples to understand inner workings while building PyNeTExConv
GNU Affero General Public License v3.0

Can I already test a Swiss NeTEx to EPIP conversion? #15

Open ue71603 opened 2 months ago

ue71603 commented 2 months ago

with which subelement?

skinkie commented 2 months ago

I think Swiss NeTEx is based on Calls, so the GTFS one might work.

ue71603 commented 2 months ago

Which sequence of .py files should I use? Then I'll test it.

ue71603 commented 2 months ago

In any-to-epip.py you read a .gz. Is this a single-file XML? The Swiss delivery is split by frame. Does it need preprocessing, and do you already have a .py that does that?

ue71603 commented 2 months ago

This would be some building blocks for the pipeline.

ue71603 commented 2 months ago

Can it be that any-to-epip is very minimalistic? It doesn't seem to copy the ResourceFrame. Also, I didn't see anywhere where it takes the Call and does something with it.

skinkie commented 2 months ago

> In any-to-epip.py you read a .gz. Is this a single-file XML? The Swiss delivery is split by frame. Does it need preprocessing, and do you already have a .py that does that?

Because it evaluates an XPath expression, it does not consider frames, only the instances of the objects. This is a potential issue if the frame defaults differ between frames, so the frame defaults might have to be applied first. But I have not yet found a complex case that mixes frame defaults.
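To illustrate what "does not consider frames" means, here is a minimal sketch using only the Python standard library. It is not the code from any-to-epip.py; the helper `find_objects` and the tiny `SAMPLE` document are made up for this example. The path expression matches every `ServiceJourney` no matter which frame contains it, which is also why per-frame defaults are never seen.

```python
import gzip
import io
import xml.etree.ElementTree as ET

NETEX_NS = {"netex": "http://www.netex.org.uk/netex"}


def find_objects(stream, tag):
    """Return every element named `tag`, regardless of the frame that holds it."""
    root = ET.parse(stream).getroot()
    return root.findall(f".//netex:{tag}", NETEX_NS)


# Hypothetical two-frame document, only for demonstration.
SAMPLE = b"""<PublicationDelivery xmlns="http://www.netex.org.uk/netex">
  <dataObjects>
    <TimetableFrame id="tf1" version="1">
      <vehicleJourneys>
        <ServiceJourney id="sj1" version="1"/>
      </vehicleJourneys>
    </TimetableFrame>
    <TimetableFrame id="tf2" version="1">
      <vehicleJourneys>
        <ServiceJourney id="sj2" version="1"/>
      </vehicleJourneys>
    </TimetableFrame>
  </dataObjects>
</PublicationDelivery>"""

# Compress in memory so the example mirrors reading a .gz without touching disk.
with gzip.open(io.BytesIO(gzip.compress(SAMPLE))) as stream:
    journeys = find_objects(stream, "ServiceJourney")

print([journey.get("id") for journey in journeys])
```

Both journeys come back in one flat list; the information about which `TimetableFrame` each one belonged to is lost unless it is looked up separately.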

> Can it be that any-to-epip is very minimalistic? It doesn't seem to copy the ResourceFrame. Also, I didn't see anywhere where it takes the Call and does something with it.

Magic happens here:

    timetabledpassingtimesprofile = TimetablePassingTimesProfile(codespace, version, service_journeys, service_journey_patterns)
    timetabledpassingtimesprofile.getTimetabledPassingTimes(clean=True)

You can probably now see what this would look like in a node-based conversion :-)

ue71603 commented 2 months ago

And how do we deal with the frame-based files: https://opentransportdata.swiss/de/dataset/timetablenetex_2024/resource/7faba8f2-797f-41f8-81e6-d8b3f55e9e85 Collate some manually for testing?
Write a collator?

skinkie commented 2 months ago

You mean line-based deliveries plus a shared file, like Norway?

ue71603 commented 2 months ago

We have COMMON, RESOURCE, SERVICE, SERVICECALENDAR, and SITE files, and then line-based TIMETABLE files.

ue71603 commented 2 months ago

It may be that the TIMETABLE files contain multiple lines ("Betriebszweig").

skinkie commented 2 months ago

Yes, in that case you would need to establish heuristically which files to load to fetch a concept. But as long as you can lazy-load them, it is not an issue.

One of the other things I would like to implement is actual lazy loading, so that concepts can be resolved 'just in time'.
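A rough sketch of what 'just in time' resolution could look like, assuming objects are looked up by their `id` attribute. The class `LazyIndex` and the in-memory sources are hypothetical and namespaces are left out for brevity; nothing here exists in the repository yet.

```python
import xml.etree.ElementTree as ET


class LazyIndex:
    """Resolve objects by id, parsing each source only when it is first needed."""

    def __init__(self, sources):
        # sources: mapping of name -> zero-argument callable returning XML bytes
        self._pending = dict(sources)
        self._by_id = {}
        self.loaded = []  # names of the sources parsed so far, in order

    def _load_next(self):
        # Take the next unparsed source and index every element carrying an id.
        name, read = next(iter(self._pending.items()))
        del self._pending[name]
        for element in ET.fromstring(read()).iter():
            if "id" in element.attrib:
                self._by_id.setdefault(element.get("id"), element)
        self.loaded.append(name)

    def get(self, object_id):
        # Keep loading sources until the id is found or nothing is left.
        while object_id not in self._by_id and self._pending:
            self._load_next()
        return self._by_id.get(object_id)


# Hypothetical stand-ins for the shared files.
index = LazyIndex({
    "SITE": lambda: b'<frame><StopPlace id="sp1"/></frame>',
    "SERVICE": lambda: b'<frame><Line id="line1"/></frame>',
})

stop = index.get("sp1")
print(stop.tag, index.loaded)
```

Here the sources are simply tried in order. The heuristic mentioned above would replace that ordering, for example by guessing the most likely file from the type of the reference.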

ue71603 commented 2 months ago

Let's assume our case: the pool to be loaded would be the COMMON, RESOURCE, SERVICE, SERVICECALENDAR, and SITE files, and then each TIMETABLE file can be processed in turn. So how do we pass more parameters:

ue71603 commented 2 months ago

Currently any-to-epip works file by file.

skinkie commented 2 months ago

> Currently any-to-epip works file by file.

Correct, only the Nordic export (not validated) splits files. The big question is: should the separate files first be loaded into, for example, a database (which would allow resolving inter-file relationships), or do we blissfully ignore those and go directly for a line-based to line-based conversion?
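For the database option, a minimal sketch with SQLite from the standard library, assuming an object's `id` alone is enough to find it (real NeTEx data would also need the version). The table layout and the helpers `load_into_db` and `resolve` are invented for illustration, and namespaces are again left out.

```python
import sqlite3
import xml.etree.ElementTree as ET

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE objects (id TEXT PRIMARY KEY, tag TEXT, source TEXT, xml TEXT)"
)


def load_into_db(connection, source, xml_bytes):
    """Store every element that carries an id, together with its source file name."""
    rows = [
        (el.get("id"), el.tag, source, ET.tostring(el, encoding="unicode"))
        for el in ET.fromstring(xml_bytes).iter()
        if el.get("id") is not None
    ]
    connection.executemany("INSERT OR REPLACE INTO objects VALUES (?, ?, ?, ?)", rows)


def resolve(connection, ref):
    """Return (source, xml) for a referenced id, or None if it is unknown."""
    return connection.execute(
        "SELECT source, xml FROM objects WHERE id = ?", (ref,)
    ).fetchone()


# Hypothetical shared file and line-based timetable file.
SERVICE = b'<frame><Line id="line1"/></frame>'
TIMETABLE = b'<frame><ServiceJourney id="sj1"><LineRef ref="line1"/></ServiceJourney></frame>'

load_into_db(connection, "SERVICE", SERVICE)

# While converting the timetable, references into other files can now be resolved.
for line_ref in ET.fromstring(TIMETABLE).iter("LineRef"):
    source, xml = resolve(connection, line_ref.get("ref"))
    print(line_ref.get("ref"), "found in", source)
```

The shared files would be loaded once, after which each TIMETABLE file can be converted on its own without keeping everything in memory.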

ue71603 commented 2 months ago

That would need 2-3 GB of memory if we load them beforehand. As for a database => you are the architect.