Closed dylanmpeck closed 2 years ago
Skip this step, and get the data from the bucket. The README files have the details.
On Thu, Aug 5, 2021, 4:42 PM Dylan Peck @.***> wrote:
Screenshot of results from running ingest.py:
[image: command2] https://user-images.githubusercontent.com/40506467/128434629-763ce84c-04d1-40b6-9749-3fef88029b7f.png
It looks like the link used to request the data in ingest.py may be broken: 'https://www.transtats.bts.gov/DownLoad_Table.asp?Table_ID=236&Has_Group=3&Is_Zipped=0'.
Receiving a 500 error when trying to access that link.
Is there an alternative link from that site that could be used?
Thanks for the quick response, Lak.
That definitely solves the issue when testing locally. However, the rest of the Qwiklab for this chapter covers the optional content: wrapping the Python code in a Cloud Function and running it on an automated schedule. Following the Cloud Function setup instructions from the README, I get the same 500 error from ingest_flights.py when testing the Cloud Function.
Looks like the URL has changed a bit. The new download URL seems to be:
https://www.transtats.bts.gov/DownLoad_Table.asp?gnoyr_Vq=FGJ&Un5_T4172=G&V5_mv22rq=D
It's a POST request that carries the requested variables as form fields, for example:
UserTableName: Reporting_Carrier_On_Time_Performance_1987_present
DBShortName: On_Time
RawDataTable: T_ONTIME_REPORTING
sqlstr: IFNFTEVDVCBZRUFSLFFVQVJURVIsTU9OVEgsREFZX09GX01PTlRILEZMX0RBVEUsT1BfVU5JUVVFX0NBUlJJRVIsT1BfQ0FSUklFUl9GTF9OVU0sT1JJR0lOX0FJUlBPUlRfSUQsT1JJR0lOX0FJUlBPUlRfU0VRX0lELE9SSUdJTl9DSVRZX01BUktFVF9JRCxPUklHSU4sT1JJR0lOX1NUQVRFX0FCUixPUklHSU5fU1RBVEVfTk0sREVTVF9BSVJQT1JUX0lELERFU1RfQUlSUE9SVF9TRVFfSUQsREVTVF9DSVRZX01BUktFVF9JRCxUQVhJX0lOIEZST00gIFRfT05USU1FX1JFUE9SVElORyBXSEVSRSBNb250aCA9MSBBTkQgWUVBUj0yMDIx
varlist: YEAR,QUARTER,MONTH,DAY_OF_MONTH,FL_DATE,OP_UNIQUE_CARRIER,OP_CARRIER_FL_NUM,ORIGIN_AIRPORT_ID,ORIGIN_AIRPORT_SEQ_ID,ORIGIN_CITY_MARKET_ID,ORIGIN,ORIGIN_STATE_ABR,ORIGIN_STATE_NM,DEST_AIRPORT_ID,DEST_AIRPORT_SEQ_ID,DEST_CITY_MARKET_ID,TAXI_IN
We'll look into changing the downloader to reflect this change.
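The POST request described above can be sketched roughly as follows. This is a minimal illustration, not the repository's downloader: the helper names `build_form` and `download` are hypothetical, the column list is shortened, and it assumes that `sqlstr` is the base64 encoding of the SQL query (the sample value decodes to a query beginning with " SELECT YEAR" and ending with "WHERE Month =1 AND YEAR=2021"). Whether the site accepts a request built this way has not been verified.

```python
import base64
import urllib.parse
import urllib.request

# URL copied from the comment above; the BTS site may change it again.
DOWNLOAD_URL = ("https://www.transtats.bts.gov/DownLoad_Table.asp"
                "?gnoyr_Vq=FGJ&Un5_T4172=G&V5_mv22rq=D")

# Shortened column list for illustration; the comment above lists more fields.
COLUMNS = ["YEAR", "MONTH", "DAY_OF_MONTH", "FL_DATE", "ORIGIN", "TAXI_IN"]


def build_form(year, month, columns=COLUMNS):
    """Build the POST form fields for one month of data (hypothetical helper)."""
    varlist = ",".join(columns)
    # Assumption: the query text mirrors what the sample sqlstr decodes to.
    sql = (f" SELECT {varlist} FROM  T_ONTIME_REPORTING "
           f"WHERE Month ={month} AND YEAR={year}")
    return {
        "UserTableName": "Reporting_Carrier_On_Time_Performance_1987_present",
        "DBShortName": "On_Time",
        "RawDataTable": "T_ONTIME_REPORTING",
        "sqlstr": base64.b64encode(sql.encode("ascii")).decode("ascii"),
        "varlist": varlist,
    }


def download(year, month, timeout=60):
    """POST the form and return the raw response bytes (untested against the live site)."""
    data = urllib.parse.urlencode(build_form(year, month)).encode("ascii")
    request = urllib.request.Request(DOWNLOAD_URL, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `build_form(2021, 1)` only constructs the fields and does not touch the network, so it is a safe way to compare the generated `sqlstr` against the value the browser sends before trying `download(2021, 1)`.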
Thanks, Lak!
When do you think this change might make it into the repo?
I'm updating the code to use the new link, but the resulting CSV file has a different structure, so the processing code has to change as well. You can watch the progress in the "edition2" branch. Since this is an optional exercise anyway, I'd suggest just using the data in the cloud bucket (as suggested in README.md) and moving on to the next chapter.