skinkie / reference

Personal repository where I collect working examples to understand inner workings while building PyNeTExConv
GNU Affero General Public License v3.0

Swiss conversion loses agency.txt and other validation errors (with partial analysis) #122

Open ue71603 opened 1 month ago

ue71603 commented 1 month ago

Started with: https://github.com/user-attachments/files/17267132/swiss4-netex.zip

Intermediate EPIP NeTEx swiss4-netex.zip

The GTFS is too big to upload, but it is easy to generate with the script runner:


    {
      "block": "swiss4",
      "scripts": [
        {"script": "clean_tmp", "args": "%%dir%%"},
        {"script": "swiss_to_db.py", "args": "--log=%%log%% d:/swiss4.zip %%dir%%/swiss-import.duckdb"},
        {"script": "epip_db_to_db.py", "args": "--log=%%log%% %%dir%%/swiss-import.duckdb %%dir%%/netex-database.duckdb"},
        {"script": "epip_db_to_xml.py", "args": "--log=%%log%% %%dir%%/swiss-import.duckdb %%dir%%/netex-database.duckdb %%dir%%/%%block%%-netex.xml"},
        {"script": "aux_assertions.py", "args": "--log=%%log%% ./aux_test_input/swiss-assertions.txt %%dir%%/%%block%%-netex.xml"},
        {"script": "aux_netex_stats.py", "args": "--log=%%log%% %%dir%%/%%block%%-netex.xml"},
        {"script": "netex_to_db.py", "args": "--log=%%log%% %%dir%%/%%block%%-netex.xml %%dir%%/netex2-database.duckdb"},
        {"script": "related_explorer.py", "args": "%%dir%%/netex2-database.duckdb ServiceJourney random %%dir%%/sj.xml"},
        {"script": "netex_db_to_gtfs.py", "args": "--log=%%log%% %%dir%%/netex-database.duckdb %%dir%%/%%block%%-gtfs.zip"},
        {"script": "aux_gtfs_check.py", "args": "--log=%%log%% %%dir%%/%%block%%-gtfs.zip"},
        {"script": "gtfs_show_map.py", "args": "--log=%%log%% %%dir%%/%%block%%-gtfs.zip %%dir%%/%%block%%-map.html"},
        {"script": "related_explorer.py", "args": "%%dir%%/netex2-database.duckdb Line random %%dir%%/line.xml"},
        {"script": "related_explorer.py", "args": "%%dir%%/swiss-import.duckdb ServiceJourney random %%dir%%/sj1.xml"},
        {"script": "related_explorer.py", "args": "%%dir%%/netex-database.duckdb ServiceJourney random %%dir%%/sj2.xml"}
      ]
    }

The following errors occur in swiss4-gtfs.zip (the output), see https://gtfs-validator-results.mobilitydata.org/a73522dc-3d79-458c-a0db-1017fbe1d3a3/report.html:

foreign_key_violation (also happens with blablacar netex -> epip -> gtfs)

Example: line 7 in stop_times.txt points to ch:1:ScheduledStopPoint:8506100:3. This key does not exist in stops.txt. There, only the following entries exist:


ch:1:ScheduledStopPoint:8506097-0-Gen,,"Sion, Roseaux",,46.2173100,7.3412570,,,0,ch:1:SLOID:8506097,,,,
ch:1:SLOID:8506100,,Frauenfeld,,47.5581620,8.8965640,,,1,,,,,
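
To find all such dangling references at once, something like the following could be run against the generated feed. It is only a minimal sketch: the function name `missing_stop_ids` is made up for this issue, the CSV headers are cut down to a few standard GTFS columns, and the trip id `t1` is a placeholder rather than a value from the real file.

```python
import csv
import io


def missing_stop_ids(stops_csv: str, stop_times_csv: str) -> set[str]:
    """Return stop_id values used in stop_times.txt that are absent from stops.txt."""
    known = {row["stop_id"] for row in csv.DictReader(io.StringIO(stops_csv))}
    used = {row["stop_id"] for row in csv.DictReader(io.StringIO(stop_times_csv))}
    return used - known


# Cut-down stand-ins for the real files (assumed headers, placeholder trip id).
STOPS = '''stop_id,stop_name,stop_lat,stop_lon
ch:1:ScheduledStopPoint:8506097-0-Gen,"Sion, Roseaux",46.2173100,7.3412570
ch:1:SLOID:8506100,Frauenfeld,47.5581620,8.8965640
'''
STOP_TIMES = '''trip_id,stop_id,stop_sequence
t1,ch:1:ScheduledStopPoint:8506100:3,1
'''

print(missing_stop_ids(STOPS, STOP_TIMES))
```

For the real output, the two strings would be read from stops.txt and stop_times.txt inside swiss4-gtfs.zip.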

The intermediate NeTEx contains this ScheduledStopPoint (without coordinates), plus a generated one with coordinates:

            <ScheduledStopPoint id="ch:1:ScheduledStopPoint:8506100:3" version="any">
              <keyList>
                <KeyValue>
                  <Key>DIDOK</Key>
                  <Value>8506100</Value>
                </KeyValue>
              </keyList>
              <Name lang="de">Frauenfeld</Name>
              <ShortName lang="de">3</ShortName>
            </ScheduledStopPoint>
            <ScheduledStopPoint id="ch:1:ScheduledStopPoint:8506100-0-Gen" version="any">
              <keyList>
                <KeyValue>
                  <Key>DIDOK</Key>
                  <Value>8506100</Value>
                </KeyValue>
              </keyList>
              <Name lang="de">Frauenfeld</Name>
              <Location>
                <Longitude>8.896564</Longitude>
                <Latitude>47.558162</Latitude>
                <Altitude>404</Altitude>
              </Location>
              <ShortName>Frauenfeld</ShortName>
            </ScheduledStopPoint>

ScheduledStopPoint:8506100:3 is used as a point in a ServiceJourneyPattern, but no PassengerStopAssignment (PSA) exists for it.

PSA situation:

            <PassengerStopAssignment id="ch:1:PassengerStopAssignment:8506100:1" version="any" order="12671">
              <ScheduledStopPointRef version="any" ref="ch:1:ScheduledStopPoint:8506100-0-Gen"/>
              <StopPlaceRef version="any" ref="ch:1:SLOID:8506100" versionRef="any"/>
              <QuayRef version="any" ref="ch:1:sloid:6100:1:1" versionRef="any"/>
            </PassengerStopAssignment>
            <PassengerStopAssignment id="ch:1:PassengerStopAssignment:8506100:11" version="any" order="12672">
              <ScheduledStopPointRef version="any" ref="ch:1:ScheduledStopPoint:8506100-0-Gen"/>
              <StopPlaceRef version="any" ref="ch:1:SLOID:8506100" versionRef="any"/>
              <QuayRef version="any" ref="ch:1:sloid:6100:0:11" versionRef="any"/>
            </PassengerStopAssignment>
            <PassengerStopAssignment id="ch:1:PassengerStopAssignment:8506100:2" version="any" order="12673">
              <ScheduledStopPointRef version="any" ref="ch:1:ScheduledStopPoint:8506100-0-Gen"/>
              <StopPlaceRef version="any" ref="ch:1:SLOID:8506100" versionRef="any"/>
              <QuayRef version="any" ref="ch:1:sloid:6100:2:2" versionRef="any"/>
            </PassengerStopAssignment>
            <PassengerStopAssignment id="ch:1:PassengerStopAssignment:8506100:3" version="any" order="12674">
              <ScheduledStopPointRef version="any" ref="ch:1:ScheduledStopPoint:8506100-0-Gen"/>
              <StopPlaceRef version="any" ref="ch:1:SLOID:8506100" versionRef="any"/>
              <QuayRef version="any" ref="ch:1:sloid:6100:2:3" versionRef="any"/>
            </PassengerStopAssignment>
            <PassengerStopAssignment id="ch:1:PassengerStopAssignment:8506100_0_Gen" version="any" order="12675">
              <ScheduledStopPointRef ref="ch:1:ScheduledStopPoint:8506100-0-Gen" versionRef="any"/>
              <StopPlaceRef version="any" ref="ch:1:SLOID:8506100" versionRef="any"/>
            </PassengerStopAssignment>

So the conversion Swiss NeTEx -> EPIP has already failed here and produced something strange.
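
A quick way to list every ScheduledStopPoint that no PassengerStopAssignment refers to might look like this. It is a sketch under assumptions: `stop_points_without_psa` is a hypothetical helper, not part of the converter, and the `<frame>` wrapper in the sample is only there to make the snippet parse on its own.

```python
import xml.etree.ElementTree as ET

# The NeTEx default namespace.
NS = {"n": "http://www.netex.org.uk/netex"}


def stop_points_without_psa(xml_text: str) -> set[str]:
    """Return ids of ScheduledStopPoints not referenced by any PassengerStopAssignment."""
    root = ET.fromstring(xml_text)
    defined = {e.get("id") for e in root.iterfind(".//n:ScheduledStopPoint", NS)}
    assigned = {
        ref.get("ref")
        for ref in root.iterfind(
            ".//n:PassengerStopAssignment/n:ScheduledStopPointRef", NS
        )
    }
    return defined - assigned


# Reduced sample; <frame> is a hypothetical wrapper element.
SAMPLE = """<frame xmlns="http://www.netex.org.uk/netex">
  <ScheduledStopPoint id="ch:1:ScheduledStopPoint:8506100:3" version="any"/>
  <ScheduledStopPoint id="ch:1:ScheduledStopPoint:8506100-0-Gen" version="any"/>
  <PassengerStopAssignment id="ch:1:PassengerStopAssignment:8506100:3" version="any" order="12674">
    <ScheduledStopPointRef version="any" ref="ch:1:ScheduledStopPoint:8506100-0-Gen"/>
  </PassengerStopAssignment>
</frame>"""

print(stop_points_without_psa(SAMPLE))
```

On the full swiss4-netex.xml this loads the whole document into memory, so a streaming parser may be needed for large files.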

missing_calendar_and_calendar_date_files (also happens with blablacar gtfs -> netex -> epip -> gtfs)

The service_id values derived from the DayType exist in the GTFS (trips.txt), but calendar.txt and calendar_dates.txt are not written to the GTFS.
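
The check below illustrates what the validator is complaining about: every service_id used in trips.txt has to be defined in calendar.txt or calendar_dates.txt. This is a sketch only. `undefined_service_ids` is a hypothetical name, the sample assumes the DayType id is used directly as service_id, and the calendar date is a placeholder.

```python
import csv
import io


def undefined_service_ids(
    trips_csv: str, calendar_csv: str = "", calendar_dates_csv: str = ""
) -> set[str]:
    """Return service_id values used in trips.txt but defined in neither calendar file."""

    def column(text: str, name: str) -> set[str]:
        # An empty string stands for a missing file and yields an empty set.
        return {row[name] for row in csv.DictReader(io.StringIO(text))}

    used = column(trips_csv, "service_id")
    defined = column(calendar_csv, "service_id") | column(
        calendar_dates_csv, "service_id"
    )
    return used - defined


# Assumed mapping: DayType id == service_id; the date below is a placeholder.
TRIPS = """route_id,service_id,trip_id
ch:2:Line:11.S.S29,ch:1:DayType:6r680,ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376
"""
CALENDAR_DATES = """service_id,date,exception_type
ch:1:DayType:6r680,20241201,1
"""

print(undefined_service_ids(TRIPS))                                      # both calendar files missing
print(undefined_service_ids(TRIPS, calendar_dates_csv=CALENDAR_DATES))  # nothing undefined
```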

agency.txt missing (happens only here, with swiss4)

The intermediate NeTEx (GTFS -> EPIP) still contains the OperatorRef of the Line, the ResponsibilitySets and the Operators. It doesn't have Operators on the ServiceJourneys (besides the ResponsibilitySet), so I have no clue what went wrong.

stop_time_with_arrival_before_previous_departure_time (only in swiss4)

e.g. tripId ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376

The NeTEx EPIP that was produced is correct, with the day offsets set:

            <ServiceJourney id="ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376" version="any" responsibilitySetRef="ch:1:ResponsibilitySet:SBB_11_SBB">
              <TransportMode>rail</TransportMode>
              <TypeOfProductCategoryRef ref="ch:1:TypeOfProductCategory:S" versionRef="any"/>
              <TypeOfServiceRef ref="ch:1:TypeOfService:1" versionRef="any"/>
              <ServiceAlteration>planned</ServiceAlteration>
              <DepartureTime>23:51:00</DepartureTime>
              <dayTypes>
                <DayTypeRef version="any" ref="ch:1:DayType:6r680"/>
              </dayTypes>
              <ServiceJourneyPatternRef version="any" ref="OPENOV:ServiceJourneyPattern:3e840d80"/>
              <LineRef ref="ch:2:Line:11.S.S29" versionRef="any"/>
              <DirectionType>inbound</DirectionType>
              <passingTimes>
                <TimetabledPassingTime id="ch:1:TimetabledPassingTime:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_1" version="any" derivedFromVersionRef="any" derivedFromObjectRef="ch:1:Call:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_1">
                  <StopPointInJourneyPatternRef version="any" ref="OPENOV:StopPointInJourneyPattern:3e840d80-1" order="1"/>
                  <DepartureTime>23:51:00</DepartureTime>
                </TimetabledPassingTime>
                <TimetabledPassingTime id="ch:1:TimetabledPassingTime:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_2" version="any" derivedFromVersionRef="any" derivedFromObjectRef="ch:1:Call:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_2">
                  <StopPointInJourneyPatternRef version="any" ref="OPENOV:StopPointInJourneyPattern:3e840d80-2" order="2"/>
                  <ArrivalTime>23:53:00</ArrivalTime>
                  <DepartureTime>23:53:00</DepartureTime>
                </TimetabledPassingTime>
                <TimetabledPassingTime id="ch:1:TimetabledPassingTime:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_3" version="any" derivedFromVersionRef="any" derivedFromObjectRef="ch:1:Call:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_3">
                  <StopPointInJourneyPatternRef version="any" ref="OPENOV:StopPointInJourneyPattern:3e840d80-3" order="3"/>
                  <ArrivalTime>23:57:00</ArrivalTime>
                  <DepartureTime>23:57:00</DepartureTime>
                </TimetabledPassingTime>
                <TimetabledPassingTime id="ch:1:TimetabledPassingTime:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_4" version="any" derivedFromVersionRef="any" derivedFromObjectRef="ch:1:Call:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_4">
                  <StopPointInJourneyPatternRef version="any" ref="OPENOV:StopPointInJourneyPattern:3e840d80-4" order="4"/>
                  <ArrivalTime>00:01:00</ArrivalTime>
                  <ArrivalDayOffset>1</ArrivalDayOffset>
                  <DepartureTime>00:01:00</DepartureTime>
                  <DepartureDayOffset>1</DepartureDayOffset>
                </TimetabledPassingTime>
                <TimetabledPassingTime id="ch:1:TimetabledPassingTime:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_5" version="any" derivedFromVersionRef="any" derivedFromObjectRef="ch:1:Call:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_5">
                  <StopPointInJourneyPatternRef version="any" ref="OPENOV:StopPointInJourneyPattern:3e840d80-5" order="5"/>
                  <ArrivalTime>00:04:00</ArrivalTime>
                  <ArrivalDayOffset>1</ArrivalDayOffset>
                  <DepartureTime>00:04:00</DepartureTime>
                  <DepartureDayOffset>1</DepartureDayOffset>
                </TimetabledPassingTime>
                <TimetabledPassingTime id="ch:1:TimetabledPassingTime:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_6" version="any" derivedFromVersionRef="any" derivedFromObjectRef="ch:1:Call:ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376_6">
                  <StopPointInJourneyPatternRef version="any" ref="OPENOV:StopPointInJourneyPattern:3e840d80-6" order="6"/>
                  <ArrivalTime>00:07:00</ArrivalTime>
                  <ArrivalDayOffset>1</ArrivalDayOffset>
                  <DepartureTime>00:07:00</DepartureTime>
                  <DepartureDayOffset>1</DepartureDayOffset>
                </TimetabledPassingTime>

In the generated stop_times.txt the processing seems to be off only for the last stop (Olten), which is written as 00:32:00 instead of a time past 24:00:00:

ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,23:51:00,23:51:00,ch:1:ScheduledStopPoint:8502007:2,1,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,23:53:00,23:53:00,ch:1:ScheduledStopPoint:8502010:2,2,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,23:57:00,23:57:00,ch:1:ScheduledStopPoint:8502006:2,3,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:01:00,24:01:00,ch:1:ScheduledStopPoint:8502005:2,4,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:04:00,24:04:00,ch:1:ScheduledStopPoint:8502004:3,5,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:07:00,24:07:00,ch:1:ScheduledStopPoint:8502003:2,6,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:09:00,24:09:00,ch:1:ScheduledStopPoint:8502002:2,7,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:12:00,24:13:00,ch:1:ScheduledStopPoint:8502001:3,8,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:17:00,24:17:00,ch:1:ScheduledStopPoint:8502000:2,9,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,24:22:00,24:23:00,ch:1:ScheduledStopPoint:8500218:11,10,,,,,,,1
ch:1:ServiceJourney:ch:1:sjyid:100001:8995-001_91029_.j24_376,00:32:00,00:32:00,ch:1:ScheduledStopPoint:8502113:1,11,,,,,,,1
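
For reference, the mapping I would expect from a NeTEx time plus day offset to a GTFS time, together with the ordering check the validator applies, can be sketched as follows. The function names are made up for this issue and do not come from the converter, and the excerpt above does not show what the NeTEx contains for stop 11.

```python
def gtfs_time(clock: str, day_offset: int = 0) -> str:
    """Fold a NeTEx HH:MM:SS time and its day offset into a GTFS time (may exceed 24:00:00)."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return f"{hours + 24 * day_offset:02d}:{minutes:02d}:{seconds:02d}"


def to_seconds(gtfs: str) -> int:
    """Convert a GTFS HH:MM:SS time to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in gtfs.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rows_going_back_in_time(rows: list[tuple[int, str, str]]) -> list[int]:
    """Return stop_sequence values whose arrival is before the previous departure.

    Each row is (stop_sequence, arrival_time, departure_time) in trip order.
    """
    bad = []
    previous_departure = None
    for sequence, arrival, departure in rows:
        if previous_departure is not None and to_seconds(arrival) < previous_departure:
            bad.append(sequence)
        previous_departure = to_seconds(departure)
    return bad


print(gtfs_time("00:01:00", 1))  # 24:01:00, as in row 4 of stop_times.txt

# The last two rows of the trip as they were written to stop_times.txt.
print(rows_going_back_in_time([(10, "24:22:00", "24:23:00"), (11, "00:32:00", "00:32:00")]))
```

If a day offset of 1 applies to stop 11 as well, `gtfs_time("00:32:00", 1)` would give the value I expected in the last row.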

skinkie commented 1 month ago

Could you split issues in the future? Several of the issues you have created combine multiple problems, including questions.

ue71603 commented 1 month ago

can do