GeneralMills / pytrends

Pseudo API for Google Trends

500 internal server errors #586

Closed hussieneloy closed 11 months ago

hussieneloy commented 12 months ago

Hello, I am using the newest version of pytrends, 4.9.1, but since yesterday I have been getting a lot of 500 (internal server error) responses. The error happens with the related_queries and related_topics methods. This is different from a few months ago, when after a Google update we were getting 429 status codes, which indicate too many requests. I tried from different machines with different IPs, but the result is the same. Has this problem occurred to anybody else? I haven't seen other issues opened yet. Is it a temporary problem with Google's servers, or is it something new in their interface that would require an update to the scraping method?

jdeverdun commented 12 months ago

It is the same for all tools like this, for requests covering the past few days (like now 1-d) :/

carinadourado commented 12 months ago

It happened to me too. Even with simple code and few requests, the error appears: "The request failed: Google returned a response with code 500".

dtaubaso commented 12 months ago

Same here, it's been happening in the last few days...

alessandropicca91 commented 12 months ago

Hey, same problem here when trying to retrieve recent data (past week), and it has been happening since yesterday

BioMikeUkr commented 12 months ago

The same problem. The problem remained when using a proxy.

felicianomariom commented 12 months ago

Hi, has anyone figured out the solution? Thanks

pitzmoni commented 12 months ago

Same here :(

danytop commented 12 months ago

same problem...

jamesbwilson commented 12 months ago

Also looking for a solution..

techblogger5 commented 12 months ago

Same happening with me

cpaulsanders commented 12 months ago

I have the same problem. Also with the npm package

Terseus commented 12 months ago

This looks like a problem in the Google Trends backend; if that's the case, we can do nothing about it.

Someone already traced that the website didn't really change its format; the only difference is the dreaded USER_TYPE_SCRAPER vs USER_TYPE_LEGIT_USER: https://github.com/PMassicotte/gtrendsR/issues/451#issuecomment-1621634700

Let's hope that it's just a problem in the Trends backend and not more throttling to scrapers.

ilyazub commented 12 months ago

It's not enough to have a valid NID cookie now. Google Trends now expects a POST request to https://trends.google.com/trends/api/explore with the reCAPTCHA token for the specific search parameters. After that, related_queries and related_topics are successfully retrieved.

[Image: actual network request]

The same issue happened months ago (ref).

[Image]

Here's a part of the Google Trends JS code. (Changing window.enableRecaptcha and this.enableRecaptcha_ to false results in blocked requests in the browser.)

d.getExploreReport = function(a) {
  var b = this
    , c = {
    req: JSON.stringify(a),
    tz: this.configService_.userTimezoneOffset
  };
  return this.enableRecaptcha_ ? this.recaptchaService_.loadRecaptchaToken().then(function(e) {
    return b.http_.post(b.apiPathPrefix_, encodeURIComponent(e), {
      params: c
    })
  }) : this.http_.get(this.apiPathPrefix_, {
    params: c
  })
};
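
For readers working in Python, the branching in the snippet above can be sketched as follows. This is only an illustration, not pytrends code: `build_explore_request` is a hypothetical helper, it assumes `apiPathPrefix_` resolves to the explore URL mentioned above, and it does not obtain a reCAPTCHA token (that part is not covered here).

```python
import json
from urllib.parse import quote

# Assumption: apiPathPrefix_ in the JS points at the explore endpoint named above.
API_PATH = "https://trends.google.com/trends/api/explore"


def build_explore_request(payload, tz_offset, recaptcha_token=None):
    """Describe the request getExploreReport would send, without sending it.

    Returns a (method, url, params, body) tuple.
    """
    # Mirrors {req: JSON.stringify(a), tz: userTimezoneOffset};
    # compact separators match JSON.stringify's output.
    params = {
        "req": json.dumps(payload, separators=(",", ":")),
        "tz": tz_offset,
    }
    if recaptcha_token is not None:
        # The JS posts encodeURIComponent(token) as the body;
        # quote() with this safe set leaves the same characters unescaped.
        body = quote(recaptcha_token, safe="!*'()")
        return ("POST", API_PATH, params, body)
    # Without reCAPTCHA enabled, the JS falls back to a plain GET.
    return ("GET", API_PATH, params, None)
```

The returned tuple could be handed to any HTTP client; the hard part, getting a token the backend accepts, still requires a real browser session.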

pepi99 commented 12 months ago

Same problem here.

As @Terseus mentioned, let's just hope this is an actual bug with Trends API and not a prevention for scrapers.

Otherwise, some Selenium logic would be very useful to solve this problem.

felicianomariom commented 12 months ago

> Same problem here.
>
> As @Terseus mentioned, let's just hope this is an actual bug with Trends API and not prevention for scrapers.
>
> Otherwise, some Selenium logic would be very useful to solve this problem.

@pepi99, I tried using Selenium, but I keep getting a 429 error when the Google Trends page auto-loads. Did you get it to work?

pepi99 commented 12 months ago

@felicianomariom, Nope, I am just starting to work on this now. Will keep you updated.

pepi99 commented 12 months ago

@felicianomariom Have a look at this code:

https://github.com/Aman7818/Google-Trends/blob/main/Api/views.py

Based on their code, you can add an additional click on the download button to download the data locally. Then you can just load it with pandas and do whatever you want with it. Here is the script to download the CSV:

import time
import random
from selenium.webdriver.common.by import By
import undetected_chromedriver as uc

search_term = 'term1, term2'

# Headless Chrome with a desktop user agent
options = uc.ChromeOptions()
options.add_argument("--disable-extensions")
options.add_argument('--headless')
options.add_argument(
    "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.5672.126 Safari/537.36")

driver = uc.Chrome(use_subprocess=True, options=options)

# Random pauses (in seconds) between steps
min_sleep_time = 1
max_sleep_time = 2

# Start from the home page and click through, rather than opening an explore URL directly
driver.get("https://trends.google.com/home?hl=en-US")
time.sleep(random.randint(min_sleep_time, max_sleep_time))

# Search box: the element carries several classes, so use a CSS selector
search_box = driver.find_element(By.CSS_SELECTOR, ".VfPpkd-fmcmS-yrriRe.VfPpkd-fmcmS-yrriRe-OWXEXe-mWPk3d")
search_box.click()
search_box.send_keys(search_term)
time.sleep(random.randint(min_sleep_time, max_sleep_time))

# "Explore" button next to the search box
explore_btn = driver.find_elements(By.CSS_SELECTOR, ".UywwFc-LgbsSe.UywwFc-LgbsSe-OWXEXe-dgl2Hf.Qt4Qjb")
if len(explore_btn) > 0:
    explore_btn_1 = explore_btn[0].find_element(By.CLASS_NAME, "UywwFc-vQzf8d")
    explore_btn_1.click()

# Wait for the explore page to render, then click the export (CSV download) button of the first widget
time.sleep(5)
export_btn = driver.find_element(By.CSS_SELECTOR, "body > div.trends-wrapper > div:nth-child(2) > div > md-content > div > div > div:nth-child(1) > trends-widget > ng-include > widget > div > div > div > widget-actions > div > button.widget-actions-item.export")
export_btn.click()

# Give the download time to finish before closing the browser
time.sleep(20)

driver.quit()

Make sure to also add additional logic for the region and the time. If you happen to do it before me, please paste your modified code here as well.

NOTE: Don't set the base URL to an already modified URL with search term/s, region and time, because if you are not going through the clicking process from the base URL, Google Trends will limit you after a couple of requests and you will get a 429 error.
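
Once the CSV is downloaded, loading it with pandas could look roughly like this. It is a sketch under assumptions: `load_trends_csv` is a hypothetical helper, and it assumes the export starts with a "Category: ..." line followed by a blank line before the real header, and that very low values are written as "<1". Check your own file and adjust `skiprows` if the layout differs.

```python
import io

import pandas as pd


def load_trends_csv(source):
    """Load a Google Trends CSV export into a date-indexed DataFrame.

    `source` may be a file path or a file-like object.
    """
    # Assumed layout: line 1 is "Category: ...", line 2 is blank,
    # line 3 is the header row (date column, then one column per term).
    df = pd.read_csv(source, skiprows=2)
    date_col = df.columns[0]
    df[date_col] = pd.to_datetime(df[date_col])
    df = df.set_index(date_col)
    # Assumption: values below 1 are exported as the string "<1"; treat them as 0.
    df = df.astype(str).replace("<1", "0").apply(pd.to_numeric)
    return df


# Small inline sample in the assumed layout, standing in for the downloaded file.
sample = io.StringIO(
    "Category: All categories\n"
    "\n"
    "Week,term1: (Worldwide),term2: (Worldwide)\n"
    "2023-07-02,55,12\n"
    "2023-07-09,60,<1\n"
)
trends = load_trends_csv(sample)
```

In practice you would pass the path of the downloaded file instead of the inline sample.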

zhajingwen commented 12 months ago

Same here

pytrends.exceptions.ResponseError: The request failed: Google returned a response with code 500

DeusData commented 12 months ago

+1

eddie-reyes commented 12 months ago

Started having problems yesterday and now every request responds with a server error.

moonjaehyun commented 12 months ago

I've been having this issue for about 5 days and all requests respond with server errors like above.

dtaubaso commented 12 months ago

@pepi99 I'm trying your code, it works great so far on local, but when deployed on Google Cloud Run it never draws the actual trendline. I don't have a clue of what's going on... This is a screenshot taken from the actual process, as you can see, I can get as far as to change the date and everything... I know that I'm crossing a line here, but maybe you have an idea of what might be going on... Thanks

https://storage.googleapis.com/nmd_img/envios/30079ac5-25a0-4b8f-944b-dac7f5454a94.png

Helldez commented 11 months ago

Any news?

send commented 11 months ago

It seems to work fine with a timeframe more than 2 weeks in the past, like 2023-06-26T09 2023-06-27T09, but more recent timeframes do not work.

RiiNagaja commented 11 months ago

@send I am running a collection from 2020 to 2020-06-15, but it doesn't work, so old timeframes are affected. You only have a day long window so I am curious why that makes it work for you. Conclusions from above seemed to say you needed to submit a new Post request. Maybe that isn't necessary for smaller data hauls?

pepi99 commented 11 months ago

https://storage.googleapis.com/nmd_img/envios/30079ac5-25a0-4b8f-944b-dac7f5454a94.png

@dtaubaso tbh, I am not sure what is the reason for that. I am running a modified version of the script I provided on a remote server and it seems to run just fine for now. If you provide me more details, I can try to help.

dtaubaso commented 11 months ago

@pepi99 don't worry, I'm using Playwright and it works fine, problem is that it takes too long to retrieve the information for each keyword, I hope someone finds a fix for the API soon...

send commented 11 months ago

@RiiNagaja I tried a request with the timeframe 2020-01-01 2020-06-15 and was able to get daily data for that period. I don't know the reason, but it seems that the data that can be retrieved is limited by the format of the timeframe.

Helldez commented 11 months ago

Hello, is it therefore confirmed that it is a transient problem of Google Trends? or should we think of another way to retrieve weekly and daily data? (example: using Selenium)

RiiNagaja commented 11 months ago

> @RiiNagaja I tried a request with the timeframe 2020-01-01 2020-06-15 and was able to get daily data for that period. I don't know the reason, but it seems that the data that can be retrieved is limited by the format of the timeframe.

Right, I forgot to say that I run a script that automatically subdivides timeframes so that hourly data can be retrieved. So it is only that part that isn't working, how strange.
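
For anyone unfamiliar with the idea, subdividing a long range into hourly-format timeframes can be sketched like this. It is not the script mentioned above: `split_timeframe` is a hypothetical helper, the 7-day window is an assumption, and the output uses the "YYYY-MM-DDTHH YYYY-MM-DDTHH" format seen earlier in this thread.

```python
from datetime import datetime, timedelta


def split_timeframe(start, end, window=timedelta(days=7)):
    """Split [start, end] into consecutive timeframe strings of at most `window` each."""
    fmt = "%Y-%m-%dT%H"
    frames = []
    cursor = start
    while cursor < end:
        # The last chunk is clipped so it never runs past `end`.
        stop = min(cursor + window, end)
        frames.append(f"{cursor.strftime(fmt)} {stop.strftime(fmt)}")
        cursor = stop
    return frames


frames = split_timeframe(datetime(2020, 1, 1), datetime(2020, 1, 10))
# frames == ["2020-01-01T00 2020-01-08T00", "2020-01-08T00 2020-01-10T00"]
```

Each string would then be passed as the timeframe of a separate request, which is where the failures described here show up.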

Helldez commented 11 months ago

Any news? It doesn't work with the now 7-d timeframe.

ccharisis commented 11 months ago

I just noticed that the response changed from 500 (Internal Server Error) to 429 (Too many requests) ....
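
Since both 500 and 429 responses surface as exceptions, a generic retry with exponential backoff is a common way to handle them on the client side. The sketch below is an illustration only: `call_with_backoff` is a hypothetical helper, not part of pytrends, and backoff cannot help while the backend rejects every request.

```python
import time


def call_with_backoff(fetch, retries=4, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fetch(), retrying on the given exceptions with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between attempts
    and re-raises the last exception once `retries` retries are used up.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

With pytrends, `fetch` could be something like `lambda: pytrends.related_queries()` and `retry_on` could be `(ResponseError,)`, using the `pytrends.exceptions.ResponseError` shown in the traceback earlier in this thread.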

nicktba commented 11 months ago

> I just noticed that the response changed from 500 (Internal Server Error) to 429 (Too many requests) ....

Yep, I'm getting 429 errors on nearly every request now.

Anyone else?

carinadourado commented 11 months ago

Yes, same here...

Draptol commented 11 months ago

Yeah, the same in the node.js lib (it uses the same mechanism as this Python lib) :(.

ericbugin1 commented 11 months ago

same problem

Helldez commented 11 months ago

Same here

EonAndahalf commented 11 months ago

same, any news ?

kitsaravana commented 11 months ago

same issue

Helldez commented 11 months ago

3 weeks without pytrends😢

nicktba commented 11 months ago

Has anyone tried reaching out to Google?

Not in relation to the PyTrends library but rather that the API isn't working at all, including for the embedding widget.

If anyone wants to embed a Google Trends chart on their website or share it to socials it's missing data or just doesn't work at all.

This API issue is the cause for libraries such as PyTrends to return 429 & 500 errors, empty dataframes etc..

[Screenshots taken 2023-07-25 at 3:43:02 PM and 3:41:45 PM]

ccharisis commented 11 months ago

I reported this exact issue (embed feature) via the Google Trends site 3 weeks ago, but there is no answer from Google yet.

nicktba commented 11 months ago

> I reported this exact issue (embed feature) via the Google Trends site 3 weeks ago, but there is no answer from Google yet.

Very atypical of them to allow an issue to persist for so long. Wonder what's going on behind the scenes.

dtaubaso commented 11 months ago

I suggest we start asking on twitter to dannysullivan and JohnMu about the embed problem, my guess is that this is the key behind this issue...

simonlim334 commented 11 months ago

@pepi99 I am unable to access the link you posted: https://github.com/Aman7818/Google-Trends/blob/main/Api/views.py. Is there any way for it to be made accessible to the public?

ccharisis commented 11 months ago

Good news!! The API is back to life! Google also solved the GTrends embed issue ..

dtaubaso commented 11 months ago

Seems to be working fine now, let's hope it lasts

hussieneloy commented 11 months ago

Ok. Let's hope it remains that way. I think keeping the issue open serves no purpose now.