cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

RT validator fails on duplicate records in Schedule data #2553

Open cal-itp-sentry[bot] opened 1 year ago

cal-itp-sentry[bot] commented 1 year ago

Sentry Issue: CAL-ITP-DATA-INFRA-24D4

CalledProcessError: Command '['java', '-jar', '/gtfs-realtime-validator.jar', '-gtfs', '/tmp/tmpwd5haiit/google_transit.zip', '-gtfsRealtimePath', '/tmp/tmpwd5haiit/rt_ac6acf70d5c9ebf9164e3e789b678196/', '-sort', 'name']' returned non-zero exit status 1.
  File "gtfs_rt_parser.py", line 634, in parse_and_validate
    return validate_and_upload(
  File "gtfs_rt_parser.py", line 377, in validate_and_upload
    execute_rt_validator(
  File "gtfs_rt_parser.py", line 360, in execute_rt_validator
    subprocess.run(

atvaccaro commented 1 year ago

This is caused by duplicate records in the Schedule feed's CSV files; we should open an upstream issue asking whether the validator can handle this kind of situation.
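
To confirm that a given feed is affected before filing upstream, a check along these lines might help. It is only a rough sketch: `find_duplicate_rows` is a hypothetical helper, not something in `gtfs_rt_parser.py`, and it assumes "duplicate" means fully identical rows within one `.txt` file of the Schedule zip. If the validator is actually failing on repeated IDs with differing columns, the comparison key would need to change.

```python
import csv
import io
import zipfile
from collections import Counter


def find_duplicate_rows(feed_zip):
    """Return {member name: [(row, count), ...]} for rows repeated within a file.

    Hypothetical helper: `feed_zip` is a path or binary file object for a
    GTFS Schedule zip. Only fully identical rows are treated as duplicates.
    """
    duplicates = {}
    with zipfile.ZipFile(feed_zip) as zf:
        for name in zf.namelist():
            if not name.endswith(".txt"):
                continue
            with zf.open(name) as raw:
                # utf-8-sig drops a leading byte-order mark if one is present
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                reader = csv.reader(text)
                next(reader, None)  # skip the header row
                counts = Counter(tuple(row) for row in reader if row)
            repeated = [(row, n) for row, n in counts.items() if n > 1]
            if repeated:
                duplicates[name] = repeated
    return duplicates
```

Running this against the `google_transit.zip` that the validator was given would list, per file, each repeated row and how many times it appears, which could be attached to the upstream issue as a reproduction.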