This repository hosts a simple Go application that reads PATH train realtime data from Matt Razza's public API and outputs a feed of the data in the GTFS Realtime format. Some important notes:
You don't need to run the application yourself.
The GTFS Realtime feed produced by this software can be accessed at https://path.transitdata.nyc/gtfsrt.
It's updated every 5 seconds.
The outputted data is compatible with the official GTFS Static data published by the Port Authority in the sense that the stop IDs and route IDs match up. The feed should work correctly for software that integrates realtime and static data.
Unfortunately the Port Authority doesn't distribute the full realtime data set, and so the GTFS Realtime feed has some big missing pieces:
The application is an HTTP server with the GTFS Realtime feed available at the /gtfsrt path.
There are 2 options for the data source to use for PATH arrival times:
In the background, the program periodically retrieves data from the selected API and updates the feed. By default, this update occurs every 5 seconds for the path-data API and every 15 seconds for the PANYNJ JSON API.
There are a few flags that can be passed to the binary:
--port <int>: the port to bind the HTTP server to (default 8080).
--timeout_period <duration>: the maximum duration to wait for a response from the source API (default 5s).
--update_period <duration>: how often to update the feed (default 5s for the path-data API, 15s for the PANYNJ API). Remember that the more frequently you update, the more stress you place on the source API, so be nice.
--use_http_source_api: use the HTTP path-data API instead of the default gRPC API.
--use_panynj_api: use the PANYNJ JSON API instead of the path-data API.
The CI process (using GitHub Actions) builds a Docker image and stores it at the jamespfennell/path-train-gtfs-realtime:latest tag on Docker Hub.
You can also build the Docker image locally by running docker build . in the root of the repo.
It is generally simplest to run the application using Docker. The only thing you need to do is port forward the HTTP server's port outside of the container. This is a functioning Docker compose configuration that does this:
version: '3.5'
services:
  path-train-gtfs-realtime:
    image: jamespfennell/path-train-gtfs-realtime:latest
    ports:
      - "8080:8080"
    restart: always
go run
When doing dev work it is generally necessary to run the application on "bare metal", which you can do simply with go run cmd/pathgtfsrt.go.
The source gRPC API and the GTFS Realtime format are both built on proto files.
Getting these proto files and compiling them to Go files is a bit of a pain, so they're kept in source control.
To regenerate them, it's probably simplest to use the Docker build process.
A number of errors can prevent the application from running 100% correctly, with the main source of errors being network failures when hitting the source API. At start-up, the application downloads static and realtime data from the API; if this fails, the application will exit.
After start-up, any further errors encountered are handled gracefully, and the server will not exit until interrupted. If, during a particular update, the realtime data for a specific stop cannot be retrieved, or is malformed, then the previously retrieved data will be used.
The application exports metrics in Prometheus format on the /metrics endpoint.
See cmd/pathgtfsrt.go for the metric definitions.
All the code in the root directory of the repo is released under the MIT License (see LICENSE).
The proto files in the sourceapi directory are sourced from the mrazza/path-data GitHub repo, are released under the MIT License and are copyright Matthew Razza.
The proto files in the gtfsrt directory are sourced from the google/transit GitHub repo, are released under the Apache License 2.0 and are copyright Google Inc.
My understanding is that the proto copyrights extend to the compiled Go files.