Open mlundblad opened 2 months ago
Actually, the "anchor part" (after the #) corresponds to the file name of the archive inside the "outer" ZIP. So maybe the intention is that the parser treats it as an "address" into the ZIP…
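To make that reading concrete, here is a minimal sketch of splitting such a feed URL into the part to download and the inner archive name. The helper name `split_feed_url` is hypothetical, and treating the fragment as an inner file name is an assumption about how these URLs are meant to be read, not a documented convention.

```python
from urllib.parse import urldefrag


def split_feed_url(url):
    """Split a feed URL into (download_url, inner_zip_name).

    Assumption: the fragment after '#' names an archive inside the
    downloaded outer ZIP. Returns None for the name if there is no fragment.
    """
    download_url, fragment = urldefrag(url)
    return download_url, fragment or None
```

With the rail link from this issue, this would give the `gtfs_public.zip` release URL as the thing to download and `google_rail.zip` as the inner archive name.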
Hi @mlundblad!
Thanks for reporting this issue here. I was not aware that @septadev already had a GTFS GitHub repository they use to publish their feeds and to track issues people have with their feeds. That's great and significantly better than all the agencies I know.
I suggest opening an issue directly there, as they will surely track their repo.
It seems this might be intended from SEPTA: https://github.com/septadev/GTFS/issues/14
In the meantime, I tested implementing support for treating the "trailing path" after # in the URL as a "sub ZIP file": extracting the downloaded ZIP and writing out the "addressed" inner ZIP, in:
Aha, and actually there seem to be direct links as well (not via the GitHub page).
So, maybe we should just use an HTTP source instead.
Issue description: GTFS feeds (obtained via Transitland) for SEPTA (Southeastern Pennsylvania Transportation Authority) contain two GTFS files within the ZIP file.
There are two links, one for a bus and one for a rail feed.
Last update of GTFS Feed: 2024-09-07
Hash of the GTFS Feed:
SHA1: adb983d5fae46af17e07ae8ae31423b2a91b6916
SHA1: da7a6dc4e8f83f9b6dd4b1dc1e984b56a25c96b5
GTFS Feed Download Link:
https://github.com/septadev/GTFS/releases/latest/download/gtfs_public.zip#google_rail.zip
https://github.com/septadev/GTFS/releases/latest/download/gtfs_public.zip#google_bus.zip
Corresponding Transitland pages:
https://www.transit.land/feeds/f-dr4-septa~rail
https://www.transit.land/feeds/f-dr4-septa~bus
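
For illustration, a rough sketch of how the inner archive named after the # in the download links could be pulled out of the already downloaded outer ZIP. This is not the implementation tested in this thread; `extract_inner_zip` is a hypothetical helper, and it assumes the outer ZIP has been fetched into memory and that the inner name matches a member path exactly.

```python
import io
import zipfile


def extract_inner_zip(outer_bytes, inner_name):
    """Return the bytes of the inner ZIP named `inner_name` inside the outer ZIP.

    Assumption: `inner_name` is the URL fragment, e.g. "google_rail.zip".
    """
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        if inner_name not in outer.namelist():
            raise KeyError(f"{inner_name!r} not found in outer ZIP")
        inner_bytes = outer.read(inner_name)
    # Check that the addressed member is itself a ZIP before handing it on.
    if not zipfile.is_zipfile(io.BytesIO(inner_bytes)):
        raise ValueError(f"{inner_name!r} is not a ZIP archive")
    return inner_bytes
```

The returned bytes could then be written to disk and passed to the usual GTFS import as if they had been downloaded directly.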