Closed FourthWiz closed 7 years ago
Hey, I have the same problem here. pytrend.interest_by_region() gives me: ValueError: No JSON object could be decoded. I think it is because the region is the main parameter and the data comes back for the US region by default.
@kritideep I think I found the issue. When you pass something to geo and then call interest_by_region(), Google expects the request to be formatted differently; in particular, I think the resolution needs to be 'SUBREGION'. That said, I probably won't get to this for a couple of days.
@FourthWiz are you on the latest version & using a valid gmail account? I'm not able to replicate on my end, so I'll need more information.
I think I've fixed this now. Calling
pytrend.build_payload(kw_list=['pizza', 'bagel'], geo='IN')
will no longer throw an error within pytrend.interest_by_region(). Try updating and see if it solves your issue.
Hey, you corrected the code, but I still get the same problem. When I call interest_by_region_df = pytrend.interest_by_region(), I get: ValueError: No JSON object could be decoded
@kritideep please provide your code as I'm not able to replicate. Remember to remove your email & password before posting.
Ok thanks a lot
Hey, here is the code:

from __future__ import absolute_import, print_function, unicode_literals
import sys
import requests
import json
import pandas as pd
from bs4 import BeautifulSoup

if sys.version_info[0] == 2:  # Python 2
    from urllib import quote
else:  # Python 3
    from urllib.parse import quote


class TrendReq(object):
    """
    Google Trends API
    """
    def __init__(self, google_username, google_password, hl='en-US', tz=360, geo='IN', custom_useragent=None):
        """
        Initialize hard-coded URLs, HTTP headers, and login parameters
        needed to connect to Google Trends, then connect.
        """
        self.username = google_username
        self.password = google_password
        self.google_rl = 'You have reached your quota limit. Please try again later.'
        self.url_login = "https://accounts.google.com/ServiceLogin"
        self.url_auth = "https://accounts.google.com/ServiceLoginAuth"
        # custom user agent so users know what "new account signin for Google" is
        if custom_useragent is None:
            self.custom_useragent = {'User-Agent': 'PyTrends'}
        else:
            self.custom_useragent = {'User-Agent': custom_useragent}
        self._connect()
        self.results = None
        # set user defined options used globally
        self.tz = tz
        self.hl = hl
        self.geo = 'IN'
        self.kw_list = list()
        # initialize widget payloads
        self.interest_overtime_widget = dict()
        self.interest_by_region_widget = dict()
        self.related_queries_widget_list = list()

    def _connect(self):
        """
        Connect to Google.
        Go to login page GALX hidden input value and send it back to google + login and password.
        http://stackoverflow.com/questions/6754709/logging-in-to-google-using-python
        """
        self.ses = requests.session()
        login_html = self.ses.get(self.url_login, headers=self.custom_useragent)
        soup_login = BeautifulSoup(login_html.content, "lxml").find('form').find_all('input')
        form_data = dict()
        for u in soup_login:
            if u.has_attr('value') and u.has_attr('name'):
                form_data[u['name']] = u['value']
        # override the inputs with our login and pwd:
        form_data['Email'] = self.username
        form_data['Passwd'] = self.password
        self.ses.post(self.url_auth, data=form_data)

    def build_payload(self, kw_list, cat=0, timeframe='today 5-y', geo='IN', gprop=''):
        """Create the payload for related queries, interest over time and interest by region"""
        token_payload = dict()
        self.kw_list = kw_list
        self.geo = geo
        token_payload['hl'] = self.hl
        token_payload['tz'] = self.tz
        token_payload['req'] = {'comparisonItem': [], 'category': cat}
        token_payload['property'] = gprop
        # build out json for each keyword
        for kw in self.kw_list:
            keyword_payload = {'keyword': kw, 'time': timeframe, 'geo': self.geo}
            token_payload['req']['comparisonItem'].append(keyword_payload)
        # requests will mangle this if it is not a string
        token_payload['req'] = json.dumps(token_payload['req'])
        # get tokens
        self._tokens(token_payload)
        return

    def _tokens(self, token_payload):
        """Makes request to Google to get API tokens for interest over time, interest by region and related queries"""
        # make the request
        req_url = "https://www.google.com/trends/api/explore"
        req = self.ses.get(req_url, params=token_payload)
        # parse the returned json
        # strip off garbage characters that break json parser
        widget_json = req.text[4:]
        widget_dict = json.loads(widget_json)['widgets']
        # order of the json matters...
        first_region_token = True
        # assign requests
        for widget in widget_dict:
            if widget['title'] == 'Interest over time':
                self.interest_over_time_widget = widget
            if widget['title'] == 'Interest by region' and first_region_token:
                self.interest_by_region_widget = widget
                first_region_token = False
            if widget['title'] == 'Interest by subregion' and first_region_token:
                self.interest_by_region_widget = widget
                first_region_token = False
            # response for each term, put into a list
            if widget['title'] == 'Related queries':
                self.related_queries_widget_list.append(widget)
        return

    def interest_over_time(self):
        """Request data from Google's Interest Over Time section and return a dataframe"""
        # make the request
        req_url = "https://www.google.co.in/trends/api/widgetdata/multiline"
        over_time_payload = dict()
        # convert to string as requests will mangle
        over_time_payload['req'] = json.dumps(self.interest_over_time_widget['request'])
        over_time_payload['token'] = self.interest_over_time_widget['token']
        over_time_payload['tz'] = self.tz
        req = self.ses.get(req_url, params=over_time_payload)
        # parse the returned json
        # strip off garbage characters that break json parser
        req_json = json.loads(req.text[5:])
        df = pd.DataFrame(req_json['default']['timelineData'])
        df['date'] = pd.to_datetime(df['time'], unit='s')
        df = df.set_index(['date']).sort_index()
        # split list columns into separate ones, remove brackets and split on comma
        result_df = df['value'].apply(lambda x: pd.Series(str(x).replace('[', '').replace(']', '').split(',')))
        # rename each column with its search term, relying on order that google provides...
        for idx, kw in enumerate(self.kw_list):
            result_df[kw] = result_df[idx].astype('int')
            del result_df[idx]
        return result_df

    def interest_by_region(self, resolution='IN'):
        """Request data from Google's Interest by Region section and return a dataframe"""
        # make the request
        req_url = "https://www.google.com/trends/api/explore"
        region_payload = dict()
        if self.geo == 'IN':
            self.interest_by_region_widget['request']['resolution'] = resolution
        region_payload['req'] = json.dumps(self.interest_by_region_widget['request'])
        region_payload['token'] = self.interest_by_region_widget['token']
        region_payload['tz'] = self.tz
        req = self.ses.get(req_url, params=region_payload)
        print(req.text)
        req_json = json.loads(req.text[5:])
        df = pd.DataFrame(req_json['default']['geoMapData'])
        df = df[['geoName', 'value']].set_index(['geoName']).sort_index()
        result_df = df['value'].apply(lambda x: pd.Series(str(x).replace('[', '').replace(']', '').split(',')))
        for idx, kw in enumerate(self.kw_list):
            result_df[kw] = result_df[idx].astype('int')
            del result_df[idx]
        return result_df

    def related_queries(self):
        """Request data from Google's Related Queries section and return a dictionary of dataframes"""
        # make the request
        req_url = "https://www.google.co.in/trends/api/widgetdata/relatedsearches"
        related_payload = dict()
        result_dict = dict()
        for request_json in self.related_queries_widget_list:
            # ensure we know which keyword we are looking at rather than relying on order
            kw = request_json['request']['restriction']['complexKeywordsRestriction']['keyword'][0]['value']
            # convert to string as requests will mangle
            related_payload['req'] = json.dumps(request_json['request'])
            related_payload['token'] = request_json['token']
            related_payload['tz'] = self.tz
            req = self.ses.get(req_url, params=related_payload)
            # parse the returned json
            # strip off garbage characters that break json parser
            req_json = json.loads(req.text[5:])
            # top queries
            top_df = pd.DataFrame(req_json['default']['rankedList'][0]['rankedKeyword'])
            top_df = top_df[['query', 'value']]
            # rising queries
            rising_df = pd.DataFrame(req_json['default']['rankedList'][1]['rankedKeyword'])
            rising_df = rising_df[['query', 'value']]
            result_dict[kw] = {'top': top_df, 'rising': rising_df}
        return result_dict

    def trending_searches(self):
        """Request data from Google's Trending Searches section and return a dataframe"""
        # make the request
        req_url = "https://www.google.co.in/trends/"
        forms = {'ajax': 1, 'pn': 'p1', 'htd': '', 'htv': 'l'}
        req = self.ses.post(req_url, data=forms)
        req_json = json.loads(req.text)['trendsByDateList']
        result_df = pd.DataFrame()
        # parse the returned json
        for trenddate in req_json:
            sub_df = pd.DataFrame()
            sub_df['date'] = trenddate['date']
            for trend in trenddate['trendsList']:
                sub_df = sub_df.append(trend, ignore_index=True)
            result_df = pd.concat([result_df, sub_df])
        return result_df

    def suggestions(self, keyword):
        """Request data from Google's Keyword Suggestion dropdown and return a dictionary"""
        # make the request
        kw_param = quote(keyword)
        req = self.ses.get("https://www.google.com/trends/api/autocomplete" + kw_param)
        # parse the returned json
        # response is invalid json but if you strip off ")]}'," from the front it is then valid
        req_json = json.loads(req.text[5:])['default']['topics']
        return req_json
example.py, here is the code:

from pytrends.request import TrendReq

google_username = ""
google_password = ""
path = ""

pytrend = TrendReq(google_username, google_password, custom_useragent='My Pytrends Script')
pytrend.build_payload(kw_list=['pizza', 'bagel'])

# over time
interest_over_time_df = pytrend.interest_over_time()
print interest_over_time_df

interest_by_region_df = pytrend.interest_by_region()
print interest_by_region_df

related_queries_dict = pytrend.related_queries()
print related_queries_dict

trending_searches_df = pytrend.trending_searches()
print trending_searches_df

top_charts_df = pytrend.top_charts(cid='actors', date=201611)
print top_charts_df

suggestions_dict = pytrend.suggestions(keyword='pizza')
The problem is that a JSON decoding error is raised when I use the India region. Please help me understand the reason behind this. Thanks a lot for your support.
You can't put the country ID in the resolution parameter. That parameter determines what 'level' of information you want; only the words 'COUNTRY' & 'CITY' work there.
If you use the latest version of pytrends, note that I set the geo parameter to India by using 'IN':
from pytrends.request import TrendReq
google_username = ""
google_password = ""
pytrend = TrendReq(google_username, google_password, custom_useragent='My Pytrends Script')
pytrend.build_payload(kw_list=['pizza', 'bagel'], geo='IN')
interest_by_region_df = pytrend.interest_by_region()
print interest_by_region_df
The code above will get you province-level data for India. If you want city-level data, you need to do the following:
interest_by_region_df = pytrend.interest_by_region(resolution='CITY')
print interest_by_region_df
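To make the difference between geo and resolution concrete, here is a minimal sketch of a guard that could be run before calling interest_by_region(). The helper name check_resolution and the ALLOWED_RESOLUTIONS set are hypothetical (they are not part of pytrends), and the allowed values are taken only from the comment above, so they may differ in other pytrends versions.

```python
# Hypothetical helper, not part of pytrends: catches a country code such as
# 'IN' being passed as `resolution` by mistake.
# Assumption taken from this thread: only 'COUNTRY' and 'CITY' are accepted.
ALLOWED_RESOLUTIONS = {'COUNTRY', 'CITY'}


def check_resolution(resolution):
    """Return the normalized resolution, or raise ValueError with a hint."""
    normalized = str(resolution).strip().upper()
    if normalized not in ALLOWED_RESOLUTIONS:
        raise ValueError(
            "resolution must be one of %s, got %r; "
            "pass country codes such as 'IN' via geo instead"
            % (sorted(ALLOWED_RESOLUTIONS), resolution)
        )
    return normalized
```

With this in place, check_resolution('IN') fails immediately with a readable message instead of producing a request that Google answers with something that is not JSON.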
I'll assume that this resolved the issue since I've not heard anything in a while. Reopen if issues persist.
Hi, the problem with ValueError "year is out of range" still occurs. I keep getting this error when I run pytrend.interest_over_time(). To be precise, all other methods are working (interest_by_region etc.).
Hey, I get this type of error:
raise SSLError(e, request=request)
requests.exceptions.SSLError: hostname 'trends.google.com' doesn't match 'www.google.com'
import pytrends
from pytrends.request import TrendReq

google_username = "*****@gmail.com"
google_password = "****"
path = "C:\Python27\Lib\site-packages\pytrends"

pytrend = TrendReq(google_username, google_password, custom_useragent='My Pytrends Script')
pytrends = build_payload(kw_list=[Dengue], cat=0, timeframe='today 5-y', geo='IN-MH', gprop='')
I am using the code above, but when I run it I get the error "NameError: name 'build_payload' is not defined".
Can anyone suggest the reason for the error and help me out?
Thanks a lot in advance!
NameError says that the function is not defined. Most likely you imported pytrends incorrectly.
I don't use Windows, so I don't really know how to do it correctly for you. I use pip and it works fine on my Mac. (http://stackoverflow.com/questions/4750806/how-do-i-install-pip-on-windows)
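Another possible cause is visible in the posted snippet itself: build_payload is called as a bare function rather than as a method on the pytrend object, and Dengue is not quoted. The sketch below reproduces that behaviour with a stand-in class. FakeTrendReq is hypothetical and makes no network calls, so it only illustrates Python's name lookup, not pytrends itself.

```python
class FakeTrendReq(object):
    """Hypothetical stand-in for TrendReq; it only stores the arguments."""

    def build_payload(self, kw_list, geo=''):
        self.kw_list = kw_list
        self.geo = geo


pytrend = FakeTrendReq()

# Calling the method name on its own fails, because no module-level
# function called build_payload exists.
try:
    build_payload(kw_list=['Dengue'], geo='IN-MH')
    bare_call_error = None
except NameError as exc:
    bare_call_error = str(exc)

# Calling it on the instance works; the keyword must be a quoted string.
pytrend.build_payload(kw_list=['Dengue'], geo='IN-MH')

print(bare_call_error)
print(pytrend.kw_list, pytrend.geo)
```

If this is the cause, changing the last line of the snippet to pytrend.build_payload(kw_list=['Dengue'], ...) should be enough, without reinstalling anything.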
Hi, I have the following issue:
Using your example, I execute the following code:
pytrend.build_payload(kw_list=['pizza', 'bagel'])
pytrend.interest_over_time()
The last call fails with "ValueError: year is out of range".
And the following call:
pytrend.interest_by_region()
gives me: ValueError: No JSON object could be decoded
At the same time pytrend.related_queries() works well.
What could be wrong here?
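One way to narrow down the "No JSON object could be decoded" error is to look at what the server actually returned before parsing it. The sketch below is a hypothetical debugging helper (parse_trends_json is not part of pytrends). It assumes the response is either JSON preceded by a short junk prefix such as )]}', or something else entirely, for example an HTML error or sign-in page, in which case it raises a ValueError that shows the start of the body.

```python
import json


def parse_trends_json(text):
    """Parse JSON that may be preceded by a junk prefix such as ")]}',".

    Hypothetical helper for debugging; not part of pytrends.
    """
    # Locate the start of the JSON payload instead of hard-coding an offset.
    starts = [i for i in (text.find('{'), text.find('[')) if i != -1]
    if not starts:
        raise ValueError('Response is not JSON; it starts with: %r' % text[:80])
    try:
        return json.loads(text[min(starts):])
    except ValueError:
        # On Python 3, json.JSONDecodeError is a subclass of ValueError.
        raise ValueError('Response is not JSON; it starts with: %r' % text[:80])
```

Calling parse_trends_json(req.text) in place of json.loads(req.text[5:]) would then report the first characters of the response, which usually makes it clearer whether the request was rejected or merely formatted differently than expected.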