cal-itp / reports

GTFS data quality reports for California transit providers
https://reports.calitp.org
GNU Affero General Public License v3.0

Bug: Place to dump random RT data issues that we see #314

Open vevetron opened 2 months ago

vevetron commented 2 months ago

Describe the bug

I see a few instances where the RT data is funny. I thought it made sense to post the cases here instead of making separate tickets for each.

These are probably not issues with reports itself, just the data behind it.

vevetron commented 2 months ago

https://reports.calitp.org/gtfs_schedule/2023/09/289/

Median trip message age is always missing for SLORTA.

vevetron commented 2 months ago

https://reports.calitp.org/gtfs_schedule/2024/08/148/

Kings is also missing Median Trip Update Ages.

vevetron commented 2 months ago

https://reports.calitp.org/gtfs_schedule/2024/08/243/

Also missing Median Trip Update Ages

vevetron commented 2 months ago

For Kings, TU messages are empty for avg and medians:

select * from mart_gtfs_quality.fct_daily_trip_updates_message_age_summary 
where base64_url = 'aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC90cmlwdXBkYXRl'
order by dt DESC

The avg and median columns don't seem to include any values. The equivalent query for VP messages looks fine:

select * from mart_gtfs_quality.fct_daily_vehicle_positions_message_age_summary
where base64_url = 'aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC92ZWhpY2xlcG9zaXRpb24='
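
The `base64_url` values in these queries are hard to read at a glance. A minimal Python sketch for decoding them back into feed URLs, assuming the column holds the URL-safe base64 encoding of the feed URL (the helper name `decode_feed_url` is hypothetical, not part of the warehouse):

```python
import base64

# base64_url values copied from the two queries above.
TU_BASE64_URL = "aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC90cmlwdXBkYXRl"
VP_BASE64_URL = "aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC92ZWhpY2xlcG9zaXRpb24="


def decode_feed_url(base64_url: str) -> str:
    """Decode a base64-encoded feed URL (hypothetical helper for illustration)."""
    # Re-add "=" padding in case it was stripped, so the decoder accepts the value.
    padded = base64_url + "=" * (-len(base64_url) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")


print(decode_feed_url(TU_BASE64_URL))
print(decode_feed_url(VP_BASE64_URL))
```

Both values decode to endpoints on the same kart.connexionz.net host, one for trip updates and one for vehicle positions, so the two queries compare feeds from the same provider.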

vevetron commented 2 months ago

In tripupdates for Connexionz, the timestamp for each message seems empty.

https://gtfs.org/documentation/realtime/reference/#message-stoptimeupdate

https://dbt-docs.calitp.org/#!/model/model.calitp_warehouse.stg_gtfs_rt__trip_updates

SELECT *
FROM cal-itp-data-infra.staging.stg_gtfs_rt__trip_updates
WHERE base64_url = 'aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC90cmlwdXBkYXRl'
  AND dt = '2024-09-11'
LIMIT 1;

This is also missing a trip_update_timestamp

https://dbt-docs.calitp.org/#!/model/model.calitp_warehouse.stg_gtfs_rt__trip_updates

The staging model derives the timestamp columns like this:

TIMESTAMP_SECONDS(header.timestamp) AS header_timestamp,
header.incrementality AS header_incrementality,
header.gtfsRealtimeVersion AS header_version,

TIMESTAMP_SECONDS(tripUpdate.timestamp) AS trip_update_timestamp,
tripUpdate.delay AS trip_update_delay,

https://dbt-docs.calitp.org/#!/source/source.calitp_warehouse.external_gtfs_rt.trip_updates

SELECT *
FROM cal-itp-data-infra.external_gtfs_rt_v2.trip_updates
WHERE base64_url = 'aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC90cmlwdXBkYXRl'
  AND dt = '2024-09-11'
LIMIT 100;

A sample of a processed TU. The tripUpdate timestamp and delay are missing here:

{
    "metadata": {
      "extract_ts": "2024-09-11 23:00:40.000000 UTC",
      "extract_config": {
        "extracted_at": "2024-09-11 02:00:32.230813 UTC",
        "name": "Kings Trip Updates",
        "url": "https://kart.connexionz.net/rtt/public/utility/gtfsrealtime.aspx/tripupdate",
        "feed_type": "trip_updates",
        "schedule_url_for_validation": "https://kart.connexionz.net/rtt/public/utility/gtfs.aspx",
        "auth_query_params": "{}",
        "auth_headers": "{}"
      }
    },
    "id": "73",
    "header": {
      "timestamp": "1726095630",
      "incrementality": "FULL_DATASET",
      "gtfsRealtimeVersion": "2.0"
    },
    "tripUpdate": {
      "trip": {
        "tripId": "73",
        "routeId": null,
        "directionId": null,
        "startTime": null,
        "startDate": null,
        "scheduleRelationship": null
      },
      "vehicle": {
        "licensePlate": null,
        "label": "3547",
        "id": "19",
        "wheelchairAccessible": null
      },
      "stopTimeUpdate": [{
        "stopSequence": "1",
        "stopId": "100",
        "arrival": {
          "delay": null,
          "time": "1726095628",
          "uncertainty": null
        },
        "departure": {
          "delay": null,
          "time": "1726095632",
          "uncertainty": null
        },
        "scheduleRelationship": null
      }],
      "timestamp": null,
      "delay": null
    },
    "dt": "2024-09-11",
    "hour": "2024-09-11 23:00:00.000000 UTC",
    "base64_url": "aHR0cHM6Ly9rYXJ0LmNvbm5leGlvbnoubmV0L3J0dC9wdWJsaWMvdXRpbGl0eS9ndGZzcmVhbHRpbWUuYXNweC90cmlwdXBkYXRl"
  }
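
To illustrate why a null `tripUpdate.timestamp` would leave the trip update age empty, here is a rough Python sketch using only the values from the sample above. The thread does not show how the mart tables actually compute message age, so this assumes age is simply the extract time minus the message timestamp; the function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

# Fields copied from the sample record above; all other fields are omitted.
sample = {
    "metadata": {"extract_ts": "2024-09-11 23:00:40.000000 UTC"},
    "header": {"timestamp": "1726095630"},
    "tripUpdate": {"timestamp": None, "delay": None},
}


def parse_extract_ts(value: str) -> datetime:
    """Parse the 'YYYY-MM-DD HH:MM:SS.ffffff UTC' format used in the sample."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f UTC")
    return parsed.replace(tzinfo=timezone.utc)


def age_seconds(extract_ts: datetime, epoch_seconds: Optional[str]) -> Optional[float]:
    """Assumed definition: extract time minus message time, None if the timestamp is missing."""
    if epoch_seconds is None:
        return None
    message_ts = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return (extract_ts - message_ts).total_seconds()


extract_ts = parse_extract_ts(sample["metadata"]["extract_ts"])
header_age = age_seconds(extract_ts, sample["header"]["timestamp"])
trip_update_age = age_seconds(extract_ts, sample["tripUpdate"]["timestamp"])

print(header_age, trip_update_age)  # 10.0 None
```

Under that assumption the header age is computable (10 seconds for this record), while the trip update age has nothing to work from. That would be consistent with empty avg and median values in the trip updates summary table.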