machow closed 2 years ago
Here is the response the validator will give (essentially no files in the data).
{'report': {'notices': [{'code': 'missing_required_file', 'severity': 'ERROR', 'totalNotices': 5, 'notices': [{'filename': 'stop_times.txt'}, {'filename': 'routes.txt'}, {'filename': 'trips.txt'}, {'filename': 'stops.txt'}, {'filename': 'agency.txt'}]}]}, 'system_errors': {'notices': []}}
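For anyone who wants just the filenames out of that response, here is a minimal sketch. It assumes the response keeps the shape shown above; the helper name `missing_required_files` is made up for illustration and is not part of the validator or the pipeline.

```python
# The validator response from above, copied verbatim.
response = {
    'report': {'notices': [{
        'code': 'missing_required_file',
        'severity': 'ERROR',
        'totalNotices': 5,
        'notices': [
            {'filename': 'stop_times.txt'},
            {'filename': 'routes.txt'},
            {'filename': 'trips.txt'},
            {'filename': 'stops.txt'},
            {'filename': 'agency.txt'},
        ],
    }]},
    'system_errors': {'notices': []},
}


def missing_required_files(response):
    """Collect filenames from every 'missing_required_file' notice group."""
    missing = []
    for group in response['report']['notices']:
        if group['code'] == 'missing_required_file':
            missing.extend(notice['filename'] for notice in group['notices'])
    return missing


print(missing_required_files(response))
```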
I want to be a little careful about adding a feed with no correctly named GTFS Schedule files. It should work okay, but we've never encountered this before...
looking at agency.docx
agency_id,agency_name,agency_url,agency_timezone,agency_lang
99,Altamont Corridor Express,http://www.amtrak.com,America/New_York,en
1207,null,http://www.amtrak.com,America/New_York,en
1206,null,http://www.amtrak.com,America/New_York,en
51,Amtrak,http://www.amtrak.com,America/New_York,en
174,Amtrak,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
155,Badger Bus,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
154,BC Ferries Connector,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
1220,null,http://www.amtrak.com,America/New_York,en
123,Cantrail,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
192,null,http://www.amtrak.com,America/New_York,en
1217,null,http://www.amtrak.com,America/New_York,en
117,Executive Transportation,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
153,Express Arrow,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
23,Indian Trails,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
108,Martz Trailways,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
136,Peoria Charter,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
147,RoadRunneR Shuttle,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
137,Smart Way Connector,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
138,Van Galder Coach USA,https://www.amtrak.com/thruway-connecting-services-multiply-your-travel-destinations,America/New_York,en
It appears to be just a matter of opening each file in Word / LibreOffice and using File -> Save As to save it as .txt.
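If doing this by hand gets tedious, the same conversion could be scripted. This is a rough sketch using only the Python standard library, and it is not what the pipeline does. It assumes each .docx holds one CSV row per paragraph (as agency.docx appears to), and it ignores tabs, manual line breaks, and other Word formatting. The function names `docx_to_text` and `convert_folder` are made up for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the paragraphs of a .docx file as newline-terminated plain text."""
    # A .docx file is a zip archive; the body text lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    lines = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text may be split across several <w:t> runs.
        lines.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(lines) + "\n"


def convert_folder(folder):
    """Write a .txt file next to every .docx file in the folder."""
    for docx in Path(folder).glob("*.docx"):
        text = docx_to_text(docx)
        # Write bytes so newlines are not translated on Windows.
        docx.with_suffix(".txt").write_bytes(text.encode("utf-8"))
```

Calling `convert_folder` on the unzipped feed directory would leave agency.txt next to agency.docx, and so on. The output should still be spot-checked before zipping it back up.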
That makes sense and things look pretty well formatted. @Nkdiaz if you have time to save them as .txt before we pair Monday, let's get you set up to handle two GTFS data intake related tasks:
(no worries if you are wrapping up your analysis, we can reformat quickly when we pair..!)
Let's add the corrected Amtrak feed in google drive to the warehouse when we pair tomorrow, and then close this.
We should ingest Amtrak around midnight UTC tonight, and have results by tomorrow :)
Not finding Amtrak results yet, any way to check that it ingested successfully? (I'm filtering by itp_id = 13 in gtfs_schedule.trips)
I checked quickly right now, and it appears that when the pipeline goes to unzip Amtrak schedule data, it gets back a folder with the data inside (so essentially thinks amtrak data is empty). When I open it on my computer it looks fine--let me try quickly doing it with python to see what's going on...
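A quick way to check for this from Python is to look at where agency.txt sits inside the zip. This is only a sketch of the check, not the pipeline's actual unzip logic. The helper name `find_gtfs_prefix` is made up, and using agency.txt as the marker file is an assumption.

```python
import zipfile
from pathlib import PurePosixPath


def find_gtfs_prefix(zip_path, marker="agency.txt"):
    """Return the folder inside the zip that holds `marker`.

    Returns "" if the marker is at the top level, the folder path if it is
    nested (e.g. "amtrak"), and None if the marker is not in the zip at all.
    """
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            # Zip member names always use forward slashes.
            path = PurePosixPath(name)
            if path.name == marker:
                parent = str(path.parent)
                return "" if parent == "." else parent
    return None
```

A non-empty result means the feed files are wrapped in a folder, so code that only looks at the top level of the archive would see the feed as empty.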
I think this is fixed?
Yep it's in the warehouse, I'll close.
Describe the bug
The Amtrak data sent to us (link to download) has apparently correct GTFS Schedule data (e.g. a file named agency.docx), but the files are not plain text.
The pipeline cannot ingest docx files; it requires plaintext CSV data named {filetype}.txt (e.g. routes.txt).
To Resolve
Note that I can still add this data to the pipeline and run validation on it. The validator will likely not have much interesting to say though.