skinkie / reference

Personal repository where I collect working examples to understand inner workings while building PyNeTExConv
GNU Affero General Public License v3.0

Transformation of Swiss NeTEx to EPIP fails #41

Closed: ue71603 closed this issue 1 week ago

ue71603 commented 2 weeks ago

The steps in the process:

python swiss_to_db.py C:/Users/ue71603/MG_Daten/conversion/swiss/swiss.zip C:/Users/ue71603/MG_Daten/conversion/swiss/swiss-import.duckdb
python epip_db_to_db.py C:/Users/ue71603/MG_Daten/conversion/swiss/netex-import.duckdb C:/Users/ue71603/MG_Daten/conversion/swiss//netex-import-epip.duckdb
python epip-db-to-xm.py C:/Users/ue71603/MG_Daten/conversion/swiss//netex-import.duckdb C:/Users/ue71603/MG_Daten/conversion/swiss//netex-import-epip.duckdb C:/Users/ue71603/MG_Daten/conversion/swiss//netex.xml.gz
del C:/Users/ue71603/MG_Daten/conversion/swiss/*.duckdb
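One detail in the commands above: step 1 writes swiss-import.duckdb, while steps 2 and 3 read netex-import.duckdb, so the later steps may not be working on the file that step 1 produced. The following is a minimal sketch, not part of the repository, that derives every path from one base directory so the file names cannot drift apart. The helper name `build_commands` and the `name` parameter are assumptions made for illustration; the script names are copied verbatim from the commands above.

```python
from pathlib import Path


def build_commands(base_dir: str, name: str = "swiss") -> list[list[str]]:
    """Return the three conversion commands with consistent file names.

    Hypothetical helper: script names are copied verbatim from the issue,
    everything else is derived from base_dir and name.
    """
    base = Path(base_dir)
    source_zip = (base / f"{name}.zip").as_posix()
    import_db = (base / f"{name}-import.duckdb").as_posix()
    epip_db = (base / f"{name}-import-epip.duckdb").as_posix()
    output_xml = (base / f"{name}.xml.gz").as_posix()
    return [
        # Step 1: import the zipped source into a database file.
        ["python", "swiss_to_db.py", source_zip, import_db],
        # Step 2: read the database written by step 1, write the EPIP database.
        ["python", "epip_db_to_db.py", import_db, epip_db],
        # Step 3: read both databases, write the compressed XML.
        ["python", "epip-db-to-xm.py", import_db, epip_db, output_xml],
    ]


commands = build_commands("C:/Users/ue71603/MG_Daten/conversion/swiss")
for command in commands:
    print(" ".join(command))
```

Each command is a list of arguments, so it could be handed to `subprocess.run` directly instead of being printed.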

The first step took about 24 hours on my machine.

The second step python epip_db_to_db.py C:/Users/ue71603/MG_Daten/conversion/swiss/netex-import.duckdb C:/Users/ue71603/MG_Daten/conversion/swiss//netex-import-epip.duckdb

aborted almost immediately with the following output/error:

(venv) PS C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test> python epip_db_to_db.py C:/Users/ue71603/MG_Daten/conversion/swiss/netex-import.duckdb C:/Users/ue71603/MG_Daten/conversion/swiss//netex-import-epip.duckdb        
CREATE TABLE IF NOT EXISTS AvailabilityCondition (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS DestinationDisplay (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Direction (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Line (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Notice (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS NoticeAssignment (id varchar(64) NOT NULL, version varchar(64) NOT NULL, ordr integer, object text NOT NULL, PRIMARY KEY (id, version, ordr));
CREATE TABLE IF NOT EXISTS Operator (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS PassengerStopAssignment (id varchar(64) NOT NULL, version varchar(64) NOT NULL, ordr integer, object text NOT NULL, PRIMARY KEY (id, version, ordr));
CREATE TABLE IF NOT EXISTS RouteLink (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS RoutePoint (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ScheduledStopPoint (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ServiceJourney (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ServiceJourneyPattern (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS StopPlace (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS VehicleType (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
epip_line_memory
epip_scheduled_stop_point_memory
epip_service_journey_generator
Traceback (most recent call last):
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\epip_db_to_db.py", line 51, in <module>
    main(args.source, args.target)
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\epip_db_to_db.py", line 41, in main
    epip_service_journey_generator(source_database_file, target_database_file, generator_defaults, None)
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py", line 398, in epip_service_journey_generator
    with sqlite3.connect(write_database) as write_con:
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py", line 404, in epip_service_journey_generator
    write_generator(write_con, ServiceJourney, query(read_con), True)
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\dbaccess.py", line 242, in write_generator
    for a in _prepare3(generator, objectname):
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\dbaccess.py", line 218, in _prepare3
    for obj in generator3:
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py", line 390, in query
    for sj in _load_generator:
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\dbaccess.py", line 51, in load_generator
    cur.execute(f"SELECT object FROM {type};")
duckdb.duckdb.InvalidInputException: Invalid Input Error: Python Object "ServiceJourney" of type "type" found on line "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py:390" not suitable for replacement scans.
Make sure that "ServiceJourney" is either a pandas.DataFrame, duckdb.DuckDBPyRelation, pyarrow Table, Dataset, RecordBatchReader, Scanner, or NumPy ndarrays with supported format
(venv) PS C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test>

Perhaps I will have to truncate the Swiss data set, so that it can be managed faster.
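A note on the error message above. When a query names a table that does not exist in the database, DuckDB's Python client falls back to a replacement scan: it looks for a Python variable with that name. Here it found the ServiceJourney class, which it cannot read as a table. The likely underlying problem is therefore that the source database has no ServiceJourney table. The sketch below shows one way to check for the table first and fail with a clearer message. It uses the standard-library sqlite3 module as a stand-in so that it runs without extra dependencies; with DuckDB the catalog lookup would go against information_schema.tables instead. The helper names `table_exists` and `load_objects` are hypothetical and not taken from dbaccess.py.

```python
import sqlite3


def table_exists(con: sqlite3.Connection, table_name: str) -> bool:
    """Return True if the catalog contains a table with exactly this name."""
    row = con.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def load_objects(con: sqlite3.Connection, table_name: str) -> list[str]:
    """Load the 'object' column of a table, failing early if it is missing."""
    if not table_exists(con, table_name):
        raise LookupError(
            f"Table {table_name!r} not found; was this database written by the import step?"
        )
    # The name was verified against the catalog, so interpolating it is safe here.
    return [row[0] for row in con.execute(f'SELECT object FROM "{table_name}";')]


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Line (id varchar(64) NOT NULL, version varchar(64) NOT NULL, "
    "object text NOT NULL, PRIMARY KEY (id, version));"
)
con.execute("INSERT INTO Line VALUES ('L1', '1', '<Line/>')")

print(load_objects(con, "Line"))
try:
    load_objects(con, "ServiceJourney")
except LookupError as error:
    print(error)
```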

ue71603 commented 2 weeks ago

Attached: swiss.zip, the reduced set I will use for the next test. It contains only a few timetable (TT) files.

ue71603 commented 2 weeks ago

I had to use epip_db_to_db.py, as swiss_db_to_db.py is not yet finished.

ue71603 commented 2 weeks ago

Tested the Swiss conversion again:


(venv) PS C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test> del C:/Users/ue71603/MG_Daten/conversion/swiss/*.duckdb
(venv) PS C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test> python swiss_to_db.py C:/Users/ue71603/MG_Daten/conversion/swiss/swiss.zip C:/Users/ue71603/MG_Daten/conversion/swiss/swiss-import.duckdb
CREATE TABLE IF NOT EXISTS AvailabilityCondition (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS DestinationDisplay (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Direction (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Line (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Operator (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS PassengerStopAssignment (id varchar(64) NOT NULL, version varchar(64) NOT NULL, ordr integer, object text NOT NULL, PRIMARY KEY (id, version, ordr));
CREATE TABLE IF NOT EXISTS ScheduledStopPoint (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ServiceJourney (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS StopPlace (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS TopographicPlace (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS VehicleType (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
(venv) PS C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test> python epip_db_to_db.py C:/Users/ue71603/MG_Daten/conversion/swiss/netex-import.duckdb C:/Users/ue71603/MG_Daten/conversion/swiss//netex-import-epip.duckdb       
CREATE TABLE IF NOT EXISTS AvailabilityCondition (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS DestinationDisplay (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Direction (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Line (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS Notice (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS NoticeAssignment (id varchar(64) NOT NULL, version varchar(64) NOT NULL, ordr integer, object text NOT NULL, PRIMARY KEY (id, version, ordr));
CREATE TABLE IF NOT EXISTS Operator (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS PassengerStopAssignment (id varchar(64) NOT NULL, version varchar(64) NOT NULL, ordr integer, object text NOT NULL, PRIMARY KEY (id, version, ordr));
CREATE TABLE IF NOT EXISTS RouteLink (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS RoutePoint (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ScheduledStopPoint (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ServiceJourney (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS ServiceJourneyPattern (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
CREATE TABLE IF NOT EXISTS StopPlace (id varchar(64) NOT NULL, version varchar(64) NOT NULL, object text NOT NULL, PRIMARY KEY (id, version));
epip_line_memory
epip_scheduled_stop_point_memory
epip_site_frame_memory
epip_service_journey_generator
 ServiceJourney 0

Traceback (most recent call last):
    main(args.source, args.target)
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\epip_db_to_db.py", line 41, in main
    epip_service_journey_generator(source_database_file, target_database_file, generator_defaults, None)
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py", line 398, in epip_service_journey_generator
    with sqlite3.connect(write_database) as write_con:
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py", line 407, in epip_service_journey_generator
    service_calendar = get_service_calendar(day_types, uic_operating_periods, day_type_assignments, generator_defaults)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ue71603\MG_Daten\github\reference\gtfs-netex-test\transformers\epip.py", line 329, in get_service_calendar
    from_date = min([uic.from_operating_day_ref_or_from_date.to_datetime() for uic in uic_operating_periods])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: min() arg is an empty sequence
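The ValueError comes from calling min() on an empty list: no operating periods were loaded, which fits the "ServiceJourney 0" line in the output. A minimal sketch of a guard that turns this into a more descriptive failure follows. The function name `earliest_from_date` is hypothetical, and it assumes the dates have already been converted to datetime objects; it is not the code in transformers/epip.py.

```python
from datetime import datetime


def earliest_from_date(from_dates: list[datetime]) -> datetime:
    """Return the earliest date, with a clear error when the input is empty."""
    if not from_dates:
        # Without this check, min([]) raises "min() arg is an empty sequence".
        raise ValueError(
            "No operating periods found in the source database; "
            "cannot derive the start of the service calendar."
        )
    return min(from_dates)


periods = [datetime(2024, 12, 15), datetime(2024, 6, 1)]
print(earliest_from_date(periods))

try:
    earliest_from_date([])
except ValueError as error:
    print(error)
```

If an empty calendar is acceptable, `min(from_dates, default=None)` is an alternative that returns None instead of raising.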

skinkie commented 2 weeks ago

Is this the latest Swiss NeTEx?

ue71603 commented 2 weeks ago

No. This is the reduced version that is attached (only a few timetable frames).

ue71603 commented 2 weeks ago

Is this a problem?

skinkie commented 2 weeks ago

No, I just wanted a way I could 100% reproduce.

ue71603 commented 2 weeks ago

The file is attached (this helps with speed and makes it possible to reproduce). That's why I reduced it to a size that GitHub can accept.

ue71603 commented 1 week ago

I will open a new issue.