GeneralMills / pytrends

Pseudo API for Google Trends
3.23k stars · 817 forks

It broke again #602

Open dtaubaso opened 11 months ago

dtaubaso commented 11 months ago

I think Google Trends is having the same issue it had last July/August: I'm getting lots of 429s, and the embed code is broken on the website. Is anyone else having the same problems?

ImNotAProCoder commented 11 months ago

Oh god, I thought I was the only one. Trying to get YouTube API Data for research and then 429s are plastered all across my screen. I really hope this gets resolved ASAP.

zhajingwen commented 11 months ago

I have the same bug. I changed the IP proxies and it worked, but today it failed with a 429 again; now I have no idea what to do.

Helldez commented 11 months ago

Same here

praburamWAPKA commented 11 months ago

Nothing works, even with proxies.

Raidus commented 11 months ago

This time it hasn't worked for a longer period. Last time the issue disappeared after a few days; this time the failures have already been piling up for two weeks.

[image: orange = success, blue = failed]

praburamWAPKA commented 11 months ago

Any idea when it will be resolved?

praburamWAPKA commented 11 months ago

This time it hasn't worked for a longer period. Last time the issue disappeared after a few days; this time the failures have already been piling up for two weeks.

[image: orange = success, blue = failed]

Have you got any alternative solution?

Helldez commented 11 months ago

It seems the folks at SerpApi have solved it.

dtaubaso commented 11 months ago

Any idea when it will be resolved?

The origin of the problem is on Google's side: the ability to embed a Trends graph is broken, so the pytrends API doesn't work. Since this is not an official Google API, and we don't know how much Google cares about Trends, we can't know when this will be solved.

praburamWAPKA commented 11 months ago

It seems the folks at SerpApi have solved it.

How did you know 🥹 How did they fix it?

praburamWAPKA commented 11 months ago

I think we can raise a concern about the downtime in embeds... maybe they will take a look.

Helldez commented 11 months ago

It already happened recently and lasted more than a month. I think we should consider alternative scraping with Selenium.

Helldez commented 11 months ago

It seems the folks at SerpApi have solved it.

How did you know 🥹 How did they fix it?

Because their searches work even with periods of less than a week.

praburamWAPKA commented 11 months ago

It already happened recently and lasted more than a month. I think we should consider alternative scraping with Selenium.

By using cookies? That's also only temporary 🥹

mccoydj1 commented 11 months ago

It works for me if I use a timeframe of 'today 1-m', but it fails if I try 'now 1-d'. I wonder if the arguments for trends over the last 7 days have changed?

Helldez commented 11 months ago

I can confirm it doesn't work from the weekly timeframe down, just like last time in July, when it lasted more than a month.

mccoydj1 commented 11 months ago

Do you have a Python function for using results from SerpApi? I have this, but the format is kind of bad:

import json

import pandas as pd
import requests

def query_google_trends(api_key, query_terms, timeframe):
    params = {
        "engine": "google_trends",
        "q": query_terms,
        "timeframe": timeframe,
        "api_key": api_key,
    }

    response = requests.get("https://serpapi.com/search", params=params)

    if response.status_code == 200:
        data = json.loads(response.text)
        interest_over_time = data['interest_over_time']['timeline_data']
        return pd.DataFrame(interest_over_time)
    else:
        return f"Error: {response.status_code}"
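The raw timeline_data rows nest per-keyword values, which is what makes the resulting DataFrame awkward. A hedged sketch of flattening them first; the field names follow SerpApi's google_trends response as I understand it, so verify against a live reply:

```python
def tidy_timeline(timeline_data):
    """Flatten SerpApi timeline_data into one row per (date, query).

    Assumes each point looks roughly like:
      {"date": "...", "values": [{"query": "...", "extracted_value": 57}, ...]}
    """
    rows = []
    for point in timeline_data:
        for v in point.get("values", []):
            rows.append({
                "date": point.get("date"),
                "query": v.get("query"),
                "value": v.get("extracted_value"),
            })
    return rows  # pd.DataFrame(rows) then gives a flat frame
```

Wrapping the flat rows in `pd.DataFrame` gives one column per field instead of a column of nested dicts.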
praburamWAPKA commented 11 months ago

A 3-month duration is working for me, and when I checked SerpApi it shows USER TYPE LEGIT. It should show USER TYPE SCRAPER for them; they are using something to get around this.

praburamWAPKA commented 11 months ago

except pytrends.exceptions.TooManyRequestsError as e:

raises: AttributeError: module 'pytrends.exceptions' has no attribute 'TooManyRequestsError'
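That AttributeError usually means an older pytrends release is installed: TooManyRequestsError only exists in recent versions. A hedged, version-safe way to pick the exception class to catch, falling back to the library's base ResponseError and then to plain Exception when the attribute is missing:

```python
import importlib

def resolve_429_exception(mod_name="pytrends.exceptions"):
    """Return TooManyRequestsError if this pytrends install has it,
    else the library's base ResponseError, else plain Exception."""
    try:
        mod = importlib.import_module(mod_name)
    except ImportError:
        return Exception  # pytrends not installed at all
    return getattr(mod, "TooManyRequestsError",
                   getattr(mod, "ResponseError", Exception))

TooManyRequests = resolve_429_exception()
# Hedged usage:
# try:
#     pytrends.interest_over_time()
# except TooManyRequests:
#     ...back off and retry...
```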

Helldez commented 11 months ago

Do you think it would be useful to build the request URL with these categories? https://trends.google.com/trends/api/explore?tz=420&req=%7B%22comparisonItem%22%3A%5B%7B%22keyword%22%3A%22Retail+sales%22%2C%22geo%22

dtaubaso commented 11 months ago

I'm also trying to retrieve data with DataForSeo, but it's broken as well; I'm getting lots of errors.

Helldez commented 11 months ago

Try this It works: https://serpapi.com/playground?engine=google_trends&q=test&geo=US&date=now+7-d&no_cache=true

chxlium commented 11 months ago

I'm also trying to retrieve data with DataForSeo but is broken as well, I'm getting lots of errors

I tried DFS yesterday and it worked well on my side. Which errors did you get?

dtaubaso commented 11 months ago

Lots and lots of blanks and it took forever each time


gl2007 commented 11 months ago

I want to know how you are using this in automation such that you end up with a too-many-requests error. Are you searching for keywords in cities or a country as a whole, or just checking the latest search trends every minute? I just found this repo and wanted to know your use cases. Long ago I dealt with similar issues, but with the Google Maps API, hence my curiosity.

ImNotAProCoder commented 11 months ago

Try this It works: https://serpapi.com/playground?engine=google_trends&q=test&geo=US&date=now+7-d&no_cache=true

SerpApi is a good alternative, but the free tier limits results per month, and after that it's $50/month, which is too much for me to pay as a student doing research. I really hope this gets fixed soon :/

praburamWAPKA commented 11 months ago

I want to know how you are using this in automation such that you end up with a too-many-requests error. Are you searching for keywords in cities or a country as a whole, or just checking the latest search trends every minute? I just found this repo and wanted to know your use cases. Long ago I dealt with similar issues, but with the Google Maps API, hence my curiosity.

Save the keywords and loop over them with multiple retries.
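A minimal sketch of that loop with exponential backoff between attempts; fetch here stands in for whatever call you make (pytrends, SerpApi, ...) and is an assumption, not part of either library:

```python
import time

def fetch_with_retries(fetch, keyword, max_retries=5, base_delay=1.0):
    """Call fetch(keyword), retrying on failure with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fetch(keyword)
        except Exception:  # e.g. a 429 / TooManyRequestsError
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...

def fetch_all(fetch, keywords):
    """Save the keywords and loop over them, retrying each one."""
    return {kw: fetch_with_retries(fetch, kw) for kw in keywords}
```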

dtaubaso commented 11 months ago

That sometimes works


gl2007 commented 11 months ago

Try the latest pytrends from pypi.org; it has version 4.9.2. See this thread for the explanation.

praburamWAPKA commented 11 months ago

Try the latest pytrends from pypi.org; it has version 4.9.2. See this thread for the explanation.

Everyone is already using the latest version.

gl2007 commented 11 months ago

Try the latest pytrends from pypi.org; it has version 4.9.2. See this thread for the explanation.

Everyone is already using the latest version.

Did you care to read my comment and check? GitHub has version 4.9.1, whereas pypi.org has the latest, 4.9.2. If you don't want to check, fine; let others here try 4.9.2 and speak for themselves.

praburamWAPKA commented 11 months ago

Try the latest pytrends from pypi.org; it has version 4.9.2. See this thread for the explanation.

Everyone is already using the latest version.

Did you care to read my comment and check? GitHub has version 4.9.1, whereas pypi.org has the latest, 4.9.2. If you don't want to check, fine; let others here try 4.9.2 and speak for themselves.

Bruh... everyone installs with 'pip install pytrends', not from GitHub.

CyberTamilan commented 11 months ago

Has anyone found a workaround? I'm sick of trying to get 1-day data.

gl2007 commented 11 months ago

I haven't tried to run this myself (though I plan to shortly), but one thing to try is keeping a big list of proxies in a text file and modifying the code to read them off it. That is how I worked around this issue several years back when dealing with the Google Maps API. I know this code does support a list of proxies, but I'm unsure whether any of you are passing in a long one. The code change to read from a file should be trivial.
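A hedged sketch of that idea, assuming a proxies.txt with one https://host:port entry per line (the filename and format are illustrative); pytrends' TrendReq does accept a proxies list:

```python
def load_proxies(path):
    """Read one proxy URL per line, skipping blank lines and # comments."""
    with open(path) as f:
        return [line.strip() for line in f
                if line.strip() and not line.lstrip().startswith("#")]

# Hedged usage (requires pytrends installed):
# from pytrends.request import TrendReq
# pytrends = TrendReq(hl="en-US", tz=360,
#                     proxies=load_proxies("proxies.txt"),
#                     retries=2, backoff_factor=0.5)
```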

praburamWAPKA commented 11 months ago

Any update?

Helldez commented 11 months ago

Hi everyone, I'm not a developer, but I fiddled around with AI a bit and composed this code, which works better than pytrends and gets few 429 errors even across a very large batch. I'm sharing it freely so we can help each other. The code still needs to be reviewed and put into a "formal" shape, though.

import json
import urllib.parse
from datetime import datetime
from curl_cffi import requests
import time

def build_payload(keywords, timeframe='now 7-d', geo='US'):
    token_payload = {
        'hl': 'en-US',
        'tz': '0',
        'req': {
            'comparisonItem': [{'keyword': keyword, 'time': timeframe, 'geo': geo} for keyword in keywords],
            'category': 0,
            'property': ''
        }
    }
    token_payload['req'] = json.dumps(token_payload['req'])
    return token_payload

def convert_to_desired_format(raw_data):
    trend_data = {}
    for entry in raw_data['default']['timelineData']:
        timestamp = int(entry['time'])
        date_time_str = datetime.utcfromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')
        value = entry['value'][0]
        trend_data[date_time_str] = value
    return trend_data

# Cookies
def get_google_cookies(impersonate_version='chrome110'):
    with requests.Session() as session:
        session.get("https://www.google.com", impersonate=impersonate_version)
        return session.cookies

def fetch_trends_data(keywords, days_ago=7, geo='US', hl='en-US', max_retries=5, browser_version='chrome110', browser_switch_retries=2):
    browser_versions = ['chrome110', 'edge101', 'chrome107', 'chrome104', 'chrome100', 'chrome101', 'chrome99']
    current_browser_version_index = browser_versions.index(browser_version)
    cookies = get_google_cookies(impersonate_version=browser_versions[current_browser_version_index])

    for browser_retry in range(browser_switch_retries + 1):
        data_fetched = False  # Reset data_fetched to False at the beginning of each browser_retry
        with requests.Session() as s:
            # phase 1: token
            for retry in range(max_retries):
                time.sleep(2)
                token_payload = build_payload(keywords)
                url = 'https://trends.google.com/trends/api/explore'
                params = urllib.parse.urlencode(token_payload)
                full_url = f"{url}?{params}"
                response = s.get(full_url, impersonate=browser_versions[current_browser_version_index], cookies=cookies)
                if response.status_code == 200:
                    content = response.text[4:]
                    try:
                        data = json.loads(content)
                        widgets = data['widgets']
                        tokens = {}
                        request = {}
                        for widget in widgets:
                            if widget['id'] == 'TIMESERIES':
                                tokens['timeseries'] = widget['token']
                                request['timeseries'] = widget['request']
                        break  # Break out of the retry loop as we got the token
                    except json.JSONDecodeError:
                        print(f"Failed to decode JSON while fetching token, retrying {retry + 1}/{max_retries}")
                else:
                    print(f"Error {response.status_code} while fetching token, retrying {retry + 1}/{max_retries}")
            else:
                print(f"Exceeded maximum retry attempts ({max_retries}) while fetching token. Exiting...")
                return None

            # phase 2: trends data
            for retry in range(max_retries):
                time.sleep(5)
                req_string = json.dumps(request['timeseries'], separators=(',', ':'))
                encoded_req = urllib.parse.quote(req_string, safe=':,+')
                url = f"https://trends.google.com/trends/api/widgetdata/multiline?hl={hl}&tz=0&req={encoded_req}&token={tokens['timeseries']}&tz=0"
                response = s.get(url, impersonate=browser_versions[current_browser_version_index], cookies=cookies)
                if response.status_code == 200:
                    content = response.text[5:]
                    try:
                        raw_data = json.loads(content)
                        # Convert raw data
                        trend_data = convert_to_desired_format(raw_data)
                        data_fetched = True  # Set data_fetched to True as we have successfully fetched the trend data
                        return trend_data
                    except json.JSONDecodeError:
                        print(f"Failed to decode JSON while fetching trends data, retrying {retry + 1}/{max_retries}")
                else:
                    print(f"Error {response.status_code} while fetching trends data, retrying {retry + 1}/{max_retries}")
            else:
                print(f"Exceeded maximum retry attempts ({max_retries}) while fetching trends data.")

        # change browser
        if not data_fetched and browser_retry < browser_switch_retries:
            time.sleep(5)
            current_browser_version_index = (current_browser_version_index + 1) % len(browser_versions)
            print(f"Switching browser version to {browser_versions[current_browser_version_index]} and retrying...")

    print(f"Exceeded maximum browser switch attempts ({browser_switch_retries}). Exiting...")
    return None

# Example
keywords = ["test"]
trends_data = fetch_trends_data(keywords)
print(trends_data)
mccoydj1 commented 11 months ago

Hi everyone, I'm not a developer, but I fiddled around with AI a bit and composed this code, which works better than pytrends and gets few 429 errors even across a very large batch. [...]

[quoted code omitted; identical to the block above]

Mmmmm... I think you're a developer now :)

praburamWAPKA commented 11 months ago

Hi everyone, I'm not a developer, but I fiddled around with AI a bit and composed this code, which works better than pytrends and gets few 429 errors even across a very large batch. [...]

[quoted code omitted; identical to the block above]

Can you tell me how to get related queries
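Not confirmed in this thread, but the explore response that the code above parses also carries a RELATED_QUERIES widget with its own token, so one hedged route is to collect every widget instead of only TIMESERIES (the widgetdata endpoint name for related queries is an assumption here):

```python
def extract_widgets(widgets):
    """Map widget id -> token/request for every tokenized widget in the
    explore response (e.g. TIMESERIES, RELATED_QUERIES, RELATED_TOPICS)."""
    out = {}
    for w in widgets:
        if "token" in w:
            out[w["id"]] = {"token": w["token"], "request": w.get("request")}
    return out

# Hedged usage inside the code above: after fetching /trends/api/explore,
#   widgets = extract_widgets(data["widgets"])
#   rq = widgets.get("RELATED_QUERIES")
# and then query the (assumed) endpoint
#   https://trends.google.com/trends/api/widgetdata/relatedsearches
# with rq["request"] and rq["token"], mirroring the multiline request.
```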

zhajingwen commented 10 months ago

Hi everyone, I'm not a developer, but I fiddled around with AI a bit and composed this code, which works better than pytrends and gets few 429 errors even across a very large batch. [...]

[quoted code omitted; identical to the block above]

Not working:

Error 429 while fetching trends data, retrying 1/5
Error 429 while fetching trends data, retrying 2/5
Error 429 while fetching trends data, retrying 3/5
Error 429 while fetching trends data, retrying 4/5
Error 429 while fetching trends data, retrying 5/5
Exceeded maximum retry attempts (5) while fetching trends data.
Switching browser version to edge101 and retrying...
Error 429 while fetching trends data, retrying 1/5
Error 429 while fetching trends data, retrying 2/5
Error 429 while fetching trends data, retrying 3/5
Error 429 while fetching trends data, retrying 4/5
Error 429 while fetching trends data, retrying 5/5
Exceeded maximum retry attempts (5) while fetching trends data.
Switching browser version to chrome107 and retrying...
Error 429 while fetching trends data, retrying 1/5
Error 429 while fetching trends data, retrying 2/5
Error 429 while fetching trends data, retrying 3/5
Error 429 while fetching trends data, retrying 4/5
Error 429 while fetching trends data, retrying 5/5
Exceeded maximum retry attempts (5) while fetching trends data.
Exceeded maximum browser switch attempts (2). Exiting...
None

praburamWAPKA commented 10 months ago


Your IP may be blocked, and proxies are expensive, so I would suggest installing the Termux app and installing Ubuntu in it. To change your IP once you get a 429, turn mobile data off and on, or toggle Aeroplane mode.

ImNotAProCoder commented 10 months ago

My script seems to be working about half the time now, which is weird; there has been no new pytrends update.

CyberTamilan commented 10 months ago

For Rising and Top Queries

import json
import urllib.parse
from datetime import datetime, timedelta
from curl_cffi import requests
import time
import os

def build_payload(keywords, timeframe='now 1-H', geo=''):
    token_payload = {
        'hl': 'en-US',
        'tz': '0',
        'req': {
            'comparisonItem': [{'keyword': keyword, 'time': timeframe, 'geo': geo} for keyword in keywords],
            'category': 0,
            'property': ''
        }
    }
    token_payload['req'] = json.dumps(token_payload['req'])
    return token_payload

def convert_to_desired_format(raw_data):
    trend_data = {'TOP': {}, 'RISING': {}}

    if 'rankedList' in raw_data.get('default', {}):
        for item in raw_data['default']['rankedList']:
            for entry in item.get('rankedKeyword', []):
                query = entry.get('query')
                value = entry.get('value')
                if query and value:
                    link = entry.get('link', '')
                    trend_type = link.split('=')[-1].split('&')[0].upper() if link else None

                    if trend_type in ['TOP', 'RISING']:
                        trend_data[trend_type][query] = value
    return trend_data

def get_google_cookies(impersonate_version='chrome110'):
    with requests.Session() as session:
        session.get("https://www.google.com", impersonate=impersonate_version)
        return session.cookies

def fetch_trends_data(keywords, days_ago=7, geo='US', hl='en-US', max_retries=5, browser_version='chrome110', browser_switch_retries=2):
    browser_versions = ['chrome110', 'edge101', 'chrome107', 'chrome104', 'chrome100', 'chrome101', 'chrome99']
    current_browser_version_index = browser_versions.index(browser_version)
    cookies = get_google_cookies(impersonate_version=browser_versions[current_browser_version_index])

    for browser_retry in range(browser_switch_retries + 1):
        data_fetched = False
        with requests.Session() as s:
            # phase 1: token
            for retry in range(max_retries):
                time.sleep(2)
                token_payload = build_payload(keywords)
                url = 'https://trends.google.com/trends/api/explore'
                params = urllib.parse.urlencode(token_payload)
                full_url = f"{url}?{params}"
                response = s.get(full_url, impersonate=browser_versions[current_browser_version_index], cookies=cookies)
                if response.status_code == 200:
                    content = response.text[4:]
                    try:
                        data = json.loads(content)
                        widgets = data['widgets']
                        tokens = {}
                        request = {}
                        for widget in widgets:
                            if widget['id'] == 'RELATED_QUERIES':
                                tokens['related_queries'] = widget['token']
                                request['related_queries'] = widget['request']
                        break
                    except json.JSONDecodeError:
                        print(f"Failed to decode JSON while fetching token, retrying {retry + 1}/{max_retries}")
                else:
                    print(f"Error {response.status_code} while fetching token, retrying {retry + 1}/{max_retries}")
            else:
                print(f"Exceeded maximum retry attempts ({max_retries}) while fetching token. Exiting...")
                return None

            # phase 2: trends data
            for retry in range(max_retries):
                time.sleep(5)
                req_string = json.dumps(request['related_queries'], separators=(',', ':'))
                encoded_req = urllib.parse.quote(req_string, safe=':,+')
                url = f"https://trends.google.com/trends/api/widgetdata/relatedsearches?hl={hl}&tz=0&req={encoded_req}&token={tokens['related_queries']}&tz=0"
                response = s.get(url, impersonate=browser_versions[current_browser_version_index], cookies=cookies)
                print(f"URL: {url}")
                if response.status_code == 200:
                    content = response.text[5:]
                    try:
                        file_name = f"trends_data_{os.getpid()}.json"
                        with open(file_name, 'w') as json_file:
                            json_file.write(content)

                        # Remove first line from the file
                        with open(file_name, 'r') as f:
                            lines = f.readlines()[1:]
                        with open(file_name, 'w') as f:
                            f.writelines(lines)

                        # Load JSON content from the file
                        with open(file_name, 'r') as json_file:
                            data = json.load(json_file)

                        # Extract and print queries and values from both rankedLists separately
                        for item in data['default']['rankedList'][0]['rankedKeyword']:
                            print(f"Top: {item['query']}, Value: {item['value']}")

                        for item in data['default']['rankedList'][1]['rankedKeyword']:
                            print(f"Rising: {item['query']}, Value: {item['value']}")

                        return content
                    except json.JSONDecodeError:
                        print(f"Failed to decode JSON while fetching trends data, retrying {retry + 1}/{max_retries}")
                else:
                    print(f"Error {response.status_code} while fetching trends data, retrying {retry + 1}/{max_retries}")
            else:
                print(f"Exceeded maximum retry attempts ({max_retries}) while fetching trends data.")

        if not data_fetched and browser_retry < browser_switch_retries:
            time.sleep(5)
            current_browser_version_index = (current_browser_version_index + 1) % len(browser_versions)
            print(f"Switching browser version to {browser_versions[current_browser_version_index]} and retrying...")

    print(f"Exceeded maximum browser switch attempts ({browser_switch_retries}). Exiting...")
    return None

# Example
keywords = ["test"]
trends_data = fetch_trends_data(keywords)
print(trends_data)
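The link-based TOP/RISING split in `convert_to_desired_format` above can be exercised without touching Google by feeding it a stubbed payload; the `raw` dict below is hypothetical, merely shaped like a relatedsearches response:

```python
def convert_to_desired_format(raw_data):
    # Same parsing logic as the snippet above: the 'link' field encodes top vs rising.
    trend_data = {'TOP': {}, 'RISING': {}}
    if 'rankedList' in raw_data.get('default', {}):
        for item in raw_data['default']['rankedList']:
            for entry in item.get('rankedKeyword', []):
                query, value = entry.get('query'), entry.get('value')
                if query and value:
                    link = entry.get('link', '')
                    trend_type = link.split('=')[-1].split('&')[0].upper() if link else None
                    if trend_type in ['TOP', 'RISING']:
                        trend_data[trend_type][query] = value
    return trend_data

# Hypothetical payload mimicking the rankedList structure:
raw = {'default': {'rankedList': [
    {'rankedKeyword': [{'query': 'test speed', 'value': 100,
                        'link': '/trends/explore?q=test+speed&tt=top'}]},
    {'rankedKeyword': [{'query': 'test api', 'value': 250,
                        'link': '/trends/explore?q=test+api&tt=rising'}]},
]}}
parsed = convert_to_desired_format(raw)
print(parsed)  # {'TOP': {'test speed': 100}, 'RISING': {'test api': 250}}
```
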
Raidus commented 10 months ago

@CyberTamilan thanks for providing the code. I didn't know the curl_cffi repo; I will definitely try it out in the future.

Your code is currently not working for me, even with impersonate. But none of the other code I'm running for gtrends is working either. Gtrends is down again. If the iframe code is broken then all crawlers will have issues.

image

praburamWAPKA commented 10 months ago


Change the category value, and also try to use the data along with its timestamp.

Helldez commented 10 months ago

I repeat, I am not a professional developer, but I developed this code, which downloads the CSV directly from Trends and then prints the times and values. The problem is that it is very heavy to run (on PythonAnywhere it takes a lot of CPU). Help me develop it to make it more usable.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import http.cookies
import pandas as pd
import urllib.parse
import os
import json
import time
from curl_cffi import requests as cffi_requests

MAX_RETRIES = 5

def trend_selenium(keywords):
    browser_versions = ["chrome99", "chrome100", "chrome101", "chrome104", "chrome107", "chrome110"]

    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--window-size=1920,1080")
    chrome_options.add_argument("--user-data-dir=./user_data")

    driver = webdriver.Chrome(options=chrome_options)

    encoded_keywords = urllib.parse.quote_plus(keywords)

    retries = 0
    file_downloaded = False
    while retries < MAX_RETRIES and not file_downloaded:
        response = cffi_requests.get("https://www.google.com", impersonate=browser_versions[retries % len(browser_versions)])
        cookies = response.cookies
        # Selenium only accepts cookies for the domain currently loaded, so
        # open google.com before calling add_cookie().
        driver.get("https://www.google.com")
        for cookie in cookies:
            cookie_str = str(cookie)
            cookie_dict = http.cookies.SimpleCookie(cookie_str)
            for key, morsel in cookie_dict.items():
                selenium_cookie = {
                    'name': key,
                    'value': morsel.value,
                    'domain': cookie.domain
                }
                driver.add_cookie(selenium_cookie)

        trends_url = f'https://trends.google.com/trends/explore?date=now%207-d&geo=US&q={encoded_keywords}'
        print(trends_url)
        driver.get(trends_url)

        excel_button_selector = "body > div.trends-wrapper > div:nth-child(2) > div > md-content > div > div > div:nth-child(1) > trends-widget > ng-include > widget > div > div > div > widget-actions > div > button.widget-actions-item.export > i"

        try:
            WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, excel_button_selector)))
            driver.find_element(By.CSS_SELECTOR, excel_button_selector).click()
            time.sleep(5)  # Pause to let the download finish

            if os.path.exists('multiTimeline.csv'):
                file_downloaded = True
            else:
                print(f"File not downloaded. Attempt {retries + 1} of {MAX_RETRIES}...")
                retries += 1
                time.sleep(retries)  # Linear back-off before retrying
                driver.refresh()

        except Exception as e:
            print(f"Error during download attempt: {str(e)}")
            retries += 1
            time.sleep(retries)  # Linear back-off before retrying

    trend_data = {}
    trends_str = None  # initialise up front so the final return can't hit an unbound name
    if file_downloaded:
        try:
            trend_df = pd.read_csv('multiTimeline.csv', skiprows=2)
            trend_df['Time'] = pd.to_datetime(trend_df['Time']).dt.strftime('%Y-%m-%d %H:%M:%S')
            data_column = [col for col in trend_df.columns if col not in ['Time']][0]
            trend_data = dict(zip(trend_df['Time'], trend_df[data_column]))
            os.remove('multiTimeline.csv')
            trends_str = json.dumps(trend_data)
        except Exception as e:
            print(f"Error in reading or deleting the file 'multiTimeline.csv': {str(e)}")
    else:
        print("File not downloaded after the maximum number of attempts.")

    driver.quit()
    return trends_str

keywords = "test"
trends_str = trend_selenium(keywords)
print(trends_str)
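The CSV-to-dict step in the script above can be verified offline with an in-memory sample; the data below is made up, but follows the two-preamble-row layout of Trends' multiTimeline.csv export:

```python
import io
import json

import pandas as pd

# Fabricated sample mirroring the export layout: two preamble rows, then
# a 'Time' column plus one column per keyword.
sample = """Category: All categories

Time,test: (United States)
2024-01-01T00:00,41
2024-01-01T01:00,37
"""
trend_df = pd.read_csv(io.StringIO(sample), skiprows=2)
trend_df['Time'] = pd.to_datetime(trend_df['Time']).dt.strftime('%Y-%m-%d %H:%M:%S')
data_column = [col for col in trend_df.columns if col != 'Time'][0]
# Cast to plain int: numpy's int64 is not JSON-serializable by json.dumps.
trend_data = {t: int(v) for t, v in zip(trend_df['Time'], trend_df[data_column])}
print(json.dumps(trend_data))
```

Note the cast to int: without it, `json.dumps` raises a TypeError on the numpy integers that pandas produces.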
dhruv-1010 commented 10 months ago


Thanks, it's working with some modifications; my use cases can be satisfied! Upvoted.

francksa commented 10 months ago

GPT suggests this refactoring to make it less CPU-intensive:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import http.cookies
import pandas as pd
import urllib.parse
import os
import json
from curl_cffi import requests as cffi_requests

MAX_RETRIES = 5

def create_chrome_driver():
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--window-size=1920,1080")
    chrome_options.add_argument("--user-data-dir=./user_data")

    return webdriver.Chrome(options=chrome_options)

def get_trends_url(keywords):
    encoded_keywords = urllib.parse.quote_plus(keywords)
    return f'https://trends.google.com/trends/explore?date=now%207-d&geo=US&q={encoded_keywords}'

def download_trends_data(driver, trends_url):
    driver.get(trends_url)
    excel_button_selector = "body > div.trends-wrapper > div:nth-child(2) > div > md-content > div > div > div:nth-child(1) > trends-widget > ng-include > widget > div > div > div > widget-actions > div > button.widget-actions-item.export > i"

    try:
        WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, excel_button_selector)))
        driver.find_element(By.CSS_SELECTOR, excel_button_selector).click()
        WebDriverWait(driver, 30).until(lambda x: os.path.exists('multiTimeline.csv'))
        return True
    except Exception as e:
        print(f"Error during download attempt: {str(e)}")
        return False

def read_and_delete_csv():
    try:
        trend_df = pd.read_csv('multiTimeline.csv', skiprows=2)
        trend_df['Time'] = pd.to_datetime(trend_df['Time']).dt.strftime('%Y-%m-%d %H:%M:%S')
        data_column = [col for col in trend_df.columns if col not in ['Time']][0]
        trend_data = dict(zip(trend_df['Time'], trend_df[data_column]))
        os.remove('multiTimeline.csv')
        return json.dumps(trend_data)
    except Exception as e:
        print(f"Error in reading or deleting the file 'multiTimeline.csv': {str(e)}")
        return None

def trend_selenium(keywords):
    browser_versions = ["chrome99", "chrome100", "chrome101", "chrome104", "chrome107", "chrome110"]
    driver = create_chrome_driver()
    trends_url = get_trends_url(keywords)

    retries = 0
    while retries < MAX_RETRIES:
        response = cffi_requests.get("https://www.google.com", impersonate=browser_versions[retries % len(browser_versions)])
        cookies = response.cookies
        for cookie in cookies:
            cookie_str = str(cookie)
            cookie_dict = http.cookies.SimpleCookie(cookie_str)
            for key, morsel in cookie_dict.items():
                selenium_cookie = {
                    'name': key,
                    'value': morsel.value,
                    'domain': cookie.domain
                }
                driver.add_cookie(selenium_cookie)

        if download_trends_data(driver, trends_url):
            trends_str = read_and_delete_csv()
            if trends_str:
                driver.quit()
                return trends_str

        retries += 1

    driver.quit()
    print("File not downloaded after the maximum number of attempts.")
    return None

keywords = "test"
trends_str = trend_selenium(keywords)
print(trends_str)

dev-est commented 10 months ago

One workaround I've had work the last few days is using the retry function built into the TrendReq call:

TrendReq(retries=10,backoff_factor=0.1)

this does require urllib3 < 2, as urllib3 is forced to update when the requests library is updated to the latest version.
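For reference, under urllib3 1.x (which, as noted above, this setup effectively requires) a backoff_factor of 0.1 translates into a sleep of backoff_factor * 2**(n-1) seconds after the n-th consecutive failure, with no sleep before the second attempt and a cap of 120 s. A small sketch of that arithmetic (`backoff_delay` is a hypothetical helper, not a pytrends API):

```python
def backoff_delay(backoff_factor, consecutive_failures, cap=120.0):
    # Mirrors urllib3 1.x Retry.get_backoff_time(): no sleep until there have
    # been at least two consecutive failures, then exponential growth up to cap.
    if consecutive_failures <= 1:
        return 0.0
    return min(cap, backoff_factor * (2 ** (consecutive_failures - 1)))

delays = [backoff_delay(0.1, n) for n in range(1, 6)]
print(delays)  # [0.0, 0.2, 0.4, 0.8, 1.6]
```

So retries=10 with backoff_factor=0.1 stays gentle early on; raising the factor to 1.0 or more gives Google's rate limiter a much longer cool-down.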

thanhtoan1196 commented 10 months ago


retry doesn't work for me

papageorgiou commented 10 months ago

This time its even for a longer period not working. Last time the issue disappeared after a few days. It has been already two weeks since this issues are piling up.

image Orange = Success, Blue = Failed

It seems that after approximately two weeks of turbulence it's again possible to get responses back from GT at a decent success rate. I say two weeks because that was my starting point, with my target being ~1K search terms for interest over time. Thank you @Raidus for sharing this graph; it helped me develop some intuition about the situation.