sungchun12 / schedule-python-script-using-Google-Cloud

:clock4: Schedules a Python script to append data into BigQuery using Google Cloud's App Engine with a cron job

Keep getting asked for credentials in CLI? #1

Open nimsim opened 6 years ago

nimsim commented 6 years ago

Heya,

This was just what I was looking for, but it seems I'm having major difficulties getting it to actually run.

I've done exactly what you've written and tried both with and without an API key, but I keep getting this error:

File "/env/local/lib/python2.7/site-packages/pandas_gbq/gbq.py", line 194, in get_credentials
    credentials = self.get_user_account_credentials()
File "/env/local/lib/python2.7/site-packages/pandas_gbq/gbq.py", line 370, in get_user_account_credentials
    credentials = app_flow.run_console()
File "/env/local/lib/python2.7/site-packages/google_auth_oauthlib/flow.py", line 362, in run_console
    code = input(authorization_code_message)
EOFError: EOF when reading a line

Do you have any idea why?
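
The traceback itself narrows this down: run_console() is pandas-gbq's interactive user-account OAuth flow, which prompts for an authorization code on stdin, and an App Engine cron job has no stdin, so the prompt immediately hits end-of-file. A minimal sketch of one way around it, assuming a pandas-gbq version that accepts a credentials argument (0.8.0 and later; older releases took a private_key argument instead), with "key.json", "your-project-id", and the table path as placeholders:

from google.oauth2 import service_account
import pandas as pd
import pandas_gbq

# Hypothetical example frame; in the real script this is the DataFrame
# built from the Socrata results.
df = pd.DataFrame({"example_field": ["example_value"]})

# With explicit service-account credentials, pandas-gbq never falls back
# to the interactive run_console() flow that raised the EOFError above.
credentials = service_account.Credentials.from_service_account_file("key.json")
pandas_gbq.to_gbq(
    df,
    "your_dataset.your_table",
    project_id="your-project-id",
    credentials=credentials,
    if_exists="append",
)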

sungchun12 commented 6 years ago

Hey nimsim!

I recommend updating the requirements.txt with the latest pandas-gbq package: https://pandas-gbq.readthedocs.io/en/latest/

Also, I'd double-check that all your APIs are enabled and that the project and dataset IDs in the Python script match what you see in the BigQuery interface.
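
For reference, the relevant requirements.txt lines would look something like this; the pins are purely illustrative, not versions confirmed in this thread:

# illustrative pins only; use whatever the pandas-gbq docs list as current
pandas-gbq==0.4.1
google-auth-oauthlib==0.2.0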

nimsim commented 6 years ago

Thanks for the answer. The names are exact copies, same with the APIs, although I did see the App Engine API become enabled in Cloud Shell.

I'll update the pandas requirement and see how it goes. Do you make the data calls with an API token from Socrata/City of Chicago?

EDIT: Everything installs fine, which might not be clear in the OP. The issue is that after setting everything up and running the first cron job, I get asked to authorize. Maybe it's trying to auth to BigQuery. /Nima
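
One quick diagnostic for this (a sketch, not code from the repo, assuming the google-auth package is installed) is to ask what credentials the runtime itself provides:

import google.auth

# On App Engine this should resolve to the app's default service account;
# if it raises DefaultCredentialsError instead, the environment has no
# ambient credentials and client libraries may fall back to interactive
# OAuth flows like the one in the traceback above.
credentials, project = google.auth.default()
print(project, type(credentials).__name__)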



nimsim commented 6 years ago

Updated requirements.txt with the latest package, but no go :(

sungchun12 commented 6 years ago

The script makes its calls with the API token from Socrata.

Looking at the error in more detail, it definitely has to do with the pandas-gbq package and access issues to BigQuery.

Try opening up the "append_data" script in a Datalab notebook and running it manually to see where it breaks.

Send me a screenshot of your append_data code. I have a hunch the parameters are entered incorrectly.
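
A small sanity check along the same lines, assuming google-cloud-bigquery is available in the notebook ("your-project-id" is a placeholder): listing the datasets the client can actually see quickly exposes a project/dataset ID mismatch.

from google.cloud import bigquery

# The dataset the script writes to should show up in this list; if it
# doesn't, the project ID and dataset ID are misaligned.
client = bigquery.Client(project="your-project-id")
print([ds.dataset_id for ds in client.list_datasets()])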

nimsim commented 6 years ago

Thanks for looking into it, append_data below.

Trying Datalab later tonight.

#this script appends live Chicago traffic data into BigQuery, there will be duplicates
#but that's accounted for with a saved view removing duplicates using SQL

from __future__ import print_function, absolute_import #package to smooth over python 2 and 3 differences
import pandas as pd #package for dataframes
from sodapy import Socrata #package for open source api
from google.datalab import Context #package for datalab
import time
from datetime import datetime, timedelta
import logging #package for error logging

#tracks error messaging
logging.basicConfig(level=logging.INFO)

# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
#client = Socrata("data.cityofchicago.org", None)

#indent the run function by 1 tab
def run():
    # Example authenticated client (needed for non-public datasets):
    client = Socrata("data.cityofchicago.org", "tokenremoved")

    # First 2000 results, returned as JSON from API / converted to Python list of
    # dictionaries by sodapy.
    results = client.get("8v9j-bter", limit=2000)

    # Convert to pandas DataFrame
    results_df = pd.DataFrame.from_records(results)

    #have this go directly into bigquery syntax-https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_gbq.html
    results_df.to_gbq('chicago_traffic.demo_data', "demo-nim", chunksize=2000, verbose=True, if_exists='append')

sungchun12 commented 6 years ago

Did you create a dataset ID named "chicago_traffic" and an empty table named "demo_data"?
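
If the dataset is missing, it can be created up front. A hedged sketch with google-cloud-bigquery (exists_ok needs a reasonably recent release of the library); to_gbq with if_exists='append' creates the table itself, so only the dataset has to exist:

from google.cloud import bigquery

# Create the dataset if it's missing; the demo_data table is created by
# to_gbq on the first append, so it doesn't need to be made by hand.
client = bigquery.Client(project="demo-nim")
client.create_dataset("chicago_traffic", exists_ok=True)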

nimsim commented 6 years ago

I didn't create demo_data. I read it as meaning one would be created automatically if it didn't exist. Also, the instructions only mention creating the dataset ID.

You need at least one field in an empty table. Did you put in all the fields expected from the stream? Edit: Ignore what I wrote above, that's not needed :)

nimsim commented 6 years ago

I tried setting up a new project and going through the setup again, this time with demo_data. No go. I'm actually going to focus on doing a Cloud Functions GET method to Pub/Sub and from Pub/Sub to BigQuery. Then I won't have to deal with App Engine at all. Hopefully it'll work :)

Thanks for the help, and sorry for asking so many questions.
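
For what it's worth, the Pub/Sub-to-BigQuery half of that plan can be quite small. A rough sketch of a background Cloud Function triggered by a Pub/Sub topic (function name and table path are illustrative, not from this thread):

import base64
import json

from google.cloud import bigquery

client = bigquery.Client()

def pubsub_to_bigquery(event, context):
    # Pub/Sub delivers the message payload base64-encoded in event["data"].
    row = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    errors = client.insert_rows_json("demo-nim.chicago_traffic.demo_data", [row])
    if errors:  # insert_rows_json returns a list of per-row errors
        raise RuntimeError(errors)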

sungchun12 commented 6 years ago

My mistake on creating the demo_data table; you shouldn't have to. Either some weird access issue is going on, or the packages in the requirements.txt file aren't the right versions. So strange that it's not working when mine runs just fine.

If you need more help, feel free to ask! No need to apologize 😃

Would love to see your Cloud Functions demo when it's finished! I saw a couple of Medium blog posts about it.