sundios closed this issue 1 year ago
I'm also facing the same issue, starting 08/03/2023 at 5:00 PM.
Same issue. Can anyone tell me how to fix it?
Same issue. When can this be fixed?
It seems they now use reCAPTCHA to prevent data crawling... this will be hard to resolve.
So I guess there's no easy workaround this time?
OK, I might have found a workaround. It appears that the first connection to Google Trends now returns a 429, so if you set up the object with:
pytrend = TrendReq(retries=3)
it should work. I've tested it on my side and I get no more 429s.
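For reference, a fuller version of that retries setup might look something like this (the keyword list, timeframe, and backoff_factor value below are placeholder assumptions, not part of the comment above):

from pytrends.request import TrendReq

# retry failed requests a few times, backing off between attempts
pytrend = TrendReq(hl='en-US', tz=360, retries=3, backoff_factor=0.5)
pytrend.build_payload(['bitcoin'], timeframe='today 3-m')
df = pytrend.interest_over_time()
print(df.head())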
I am still getting a 429 with retries=3:
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='trends.google.com', port=443): Max retries exceeded with url: /trends/api/explore?hl=en-... (Caused by ResponseError('too many 429 error responses'))
I'm getting:
HTTPSConnectionPool(host='trends.google.com', port=443): Max retries exceeded with url: [URL] (Caused by ResponseError('too many 429 error responses'))
I've heard that implementing cURL Impersonate works to overcome the captcha issue.
Can someone try that and let us know?
Syndorik Brooo.... You are Insane. Really thankful to you man. I had to submit a project tomorrow based on this and now everything is working properly.
Wait, this worked for you? I'm still getting 429.
Can you show your code?
pt = TrendReq(retries=3)
pt.build_payload(terms)
df = pt.interest_over_time()
This works
The retries workaround didn't work for me.
What I did find working (unsurprisingly, yet still possibly useful for some) is replacing the headers in requests_args with a valid cookie from the browser.
For those who need it working "now" (like @Jinoy-Varghese) and for whom the retries don't work: inspect element on the Trends page, go to Network, click on a ?geo request, and copy the cookie into your construction of TrendReq.
Should look like:
p = TrendReq(requests_args={'headers': {'Cookie': 'NID COOKIE HERE'}})
Currently looking for a more consistent solution that isn't just spamming the Google servers with more retries.
Isn't there a risk of running into issues doing this?
Borrowing cookies is not my preferred method.
@nicktba Yeah definitely not a long term solution. Just was saying in case someone needed a solution rn for a school project like Jinoy. Especially when there isn't a fix implemented yet.
@maxwnewcomer Can you try implementing cURL impersonate into your payload?
The guys over at SERPAPI have been using it to resolve their issues.
Will do, how did you hear about the SERPAPI process? Cool intel.
Thanks to @nicktba I have the curl impersonate working (no retries and no cookies needed). Some of the curl_cffi session methods are different than the normal requests module, so will do some updating to the _get_data() method and hopefully push a fix soon.
New functionality will be the ability to impersonate:
chrome99
chrome100
chrome101
chrome104
chrome107
chrome110
chrome99_android
edge99
edge101
safari15_3
safari15_5
Side effect of this push will be including a new required package, curl_cffi.
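For anyone who wants to experiment before that lands, here is a rough standalone sketch of the curl_cffi impersonation itself (not the eventual pytrends API; the URL and browser target below are example assumptions):

# fetch a Trends page while impersonating a real browser's TLS/HTTP fingerprint
from curl_cffi import requests as curl_requests

resp = curl_requests.get(
    'https://trends.google.com/trends/explore?geo=US',
    impersonate='chrome110',  # any of the targets listed above should work
)
print(resp.status_code, len(resp.text))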
Amazing! Thanks Max!
I'm going to send you an email, let's chat!
I'm a little confused by the changes Google made. It seems like they wanted to make it harder to scrape their Trends data, but when you look at the response from the browser and from the impersonate-enabled version of pytrends, the user type in the browser is USER_TYPE_LEGIT_USER and the response from the working pytrends is USER_TYPE_SCRAPER. This indicates that they know it's scraping, but don't care? Yet it still breaks the pytrends scraping. Kind of odd.
I think, for now at least, they are just trying to categorize those who are scrapers and those who are not.
It's likely in the near future they will use this to forecast and integrate an API credit system or block scraping overall.
The user-type update was launched earlier this year and disrupted a wide range of unofficial APIs, pytrends included.
@nicktba Funny... I wonder how many paid/open-source SEO and SERP tools they broke and will break in the upcoming years.
Also, an update on the fix... I could push a fix without retries working rn, but would like to get that figured out first (along with tests). Worst case, if I don't hear from the curl_cffi community in a bit, I will just add an "impersonate" flag people can add to the TrendReq constructor that will flip functionality from the normal requests module to cURL Impersonate.
Awesome! I'll patiently wait for that update.
Thanks for your effort
Just opened that PR, should work for basic usage. No testing, retries, or confirmed proxy usage with that code. Still a WIP.
I believe I found a much simpler solution than @maxwnewcomer's. The request made in GetGoogleCookie is a GET, but it responds with an empty 200 response. If you instead make a POST, the API correctly responds with a cookie. The fix is simply to change this line to requests.post.
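Until that change ships, a standalone sketch of the same idea, combined with the cookie-header workaround above, might look like this (the URL and the reuse of the NID cookie come from the comments in this thread; the exact request details are assumptions, not pytrends internals):

import requests
from pytrends.request import TrendReq

# POST (not GET) so Google actually returns the NID cookie, then hand it to pytrends
resp = requests.post('https://trends.google.com/?geo=US', timeout=10)
nid = resp.cookies.get('NID')

pytrends = TrendReq(requests_args={'headers': {'Cookie': f'NID={nid}'}})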
Hahahah sick @jesvinc !! Funny how it can be that simple. I do however think the cURL impersonate functionality might be nice to have in the future. I can add that change to my PR, or you can create your own; up to you!
Haha yeah, I was shocked to find that out. Since it's so simple, I'm fine with you adding that change to your PR. I'll just add a code comment to it.
Tried @MartinNowak's patch but the issue still persists. I've got 50 keywords, out of which it returns data for 1 and then throws this error:
requests.exceptions.RetryError: HTTPSConnectionPool(host='trends.google.com', port=443): Max retries exceeded with url: /trends/api/explore?hl=en-GB&tz=360&req=%7B%22comparisonItem%22%3A+%5B%7B%22keyword%22%3A+%22accessorize%22%2C+%22time%22%3A+%222018-07-01+2023-03-12%22%2C+%22geo%22%3A+%22US%22%7D%5D%2C+%22category%22%3A+0%2C+%22property%22%3A+%22%22%7D (Caused by ResponseError('too many 429 error responses'))
Has anyone else been able to resolve this successfully?
I have also been facing the issue for the last 4 days.
@AbhishekThalanki and @arthii17, feel free to pull my fork from pull request #563. The impersonate feature has worked for me so far.
Facing the same issue; adding all local cookies to TrendReq(requests_args=...) seems to fix it temporarily.
Same issue here.
I would like to confirm that the proposed solution has worked for me perfectly. Thank you @maxwnewcomer!
Please see my comment if you need a temporary workaround to make your code work until the fix has been added to the library.
I have tried the above solutions but still seem to get 429s consistently.
The solution posted by @ckosmic here works consistently for me.
In the pytrends request.py file, at lines 76 and 89, insert explore/ before ?geo, so that
f'{BASE_TRENDS_URL}?geo={self.hl[-2:]}',
becomes
f'{BASE_TRENDS_URL}explore/?geo={self.hl[-2:]}',
@totencrab Thanks for this bit of information, it was really helpful! 👍
Any estimate on when the package will be updated with a fix? Thanks.
For people who want a fix now, you can use a Selenium webdriver to visit the page once and extract the cookie, then pass it into TrendReq():
def get_cookie():
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    driver.get("https://trends.google.com/")
    time.sleep(5)
    cookie = driver.get_cookie("NID")["value"]
    driver.quit()
    return cookie

nid_cookie = f"NID={get_cookie()}"

pytrends = TrendReq(
    ...
    requests_args={"headers": {"Cookie": nid_cookie}}
)
@totencrab Thanks.
I just wanted to thank you for this, as it completely solved the issue I had.
@danielfree this worked straight away for me, thanks! 👍
Preceded with
from selenium import webdriver
import time
to save people about 2 seconds 😄
I believe we resolved this with #570 which is now included in the v4.9.1 release. Thank you all for helping uncover the underlying issues!
I encountered this problem again. The version of pytrends is 4.9.2, but changing 'get' to 'post' in the GetGoogleCookie function works.
Thanks for the solution. I was going to lose my mind over this.
Hi! Does anyone know if the GET-to-POST solution still works? I'm using pytrends 4.9.2.
No, it does not work. I was using it in my project and it was working fine, but the next day it gave me the Too Many Requests 429 error.
Same here, it is not working for me.
I found it works intermittently; it worked a few weeks ago, but now it is not working again.
Hi, Google Trends has started blocking again. This month we have not been able to scrape data properly; it works fine for 3-4 hours but then shows the 429 error or a malformed-request error again.
I'm getting the following error:
Exception occurred: The request failed: Google returned a response with code 429.
I think this could be because Google has a new trends website?
https://searchengineland.com/google-launches-new-google-trends-portal-394026
Is there a way to fix this issue? Thanks in advance