Open Brummerling opened 8 months ago
If the timeframe you are pulling data for is older than 30 days, Google does not provide real-time data but a sample. See here: https://support.google.com/trends/answer/4365533?hl=en
If you search for low-frequency terms, the sample variance can be significant. You should therefore pull multiple samples for the same keywords and timeframe until the sample variance is reduced to an acceptable level.
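As a rough sketch of that averaging idea, the helper below takes several pulls of the same query and returns the per-date mean and standard deviation. The function name `average_pulls` is hypothetical, the `date`/`hits` column layout is assumed to match `gt$interest_over_time`, and mapping a "<1" value to 0.5 is an arbitrary choice made here for illustration, not something gtrendsR does.

```r
# Average several pulls of the same query to damp sampling noise.
# `pulls`: a list of data frames, each with a `date` and a `hits` column
# (assumed to look like gt$interest_over_time).
average_pulls <- function(pulls) {
  stacked <- do.call(rbind, lapply(seq_along(pulls), function(i) {
    raw <- as.character(pulls[[i]]$hits)
    data.frame(
      date = pulls[[i]]$date,
      # Assumption: a "<1" entry is replaced by 0.5 before converting to numeric.
      hits = suppressWarnings(as.numeric(sub("^<1$", "0.5", raw))),
      pull = i
    )
  }))
  avg <- aggregate(hits ~ date, data = stacked, FUN = mean)
  spread <- aggregate(hits ~ date, data = stacked, FUN = sd)
  names(avg)[2] <- "mean_hits"
  names(spread)[2] <- "sd_hits"
  merge(avg, spread, by = "date")
}

# Possible live usage (not run here; needs network access and gtrendsR):
# pulls <- lapply(1:5, function(i) {
#   Sys.sleep(5)  # pause between requests
#   gtrendsR::gtrends("flu", geo = "US", time = "2020-01-01 2023-08-31")$interest_over_time
# })
# smoothed <- average_pulls(pulls)
```

If `sd_hits` stays large relative to `mean_hits` for many dates, more pulls are probably needed. Whether repeated requests in quick succession actually return different samples is not guaranteed.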
Similar to what @Brummerling is seeing, I am getting a large number of 0 results for my gtrends queries as well. I used the search term "Google" as a test example and found a period of all 0s from the start of 2020 to early 2021.
This was my gtrends call:
library(gtrendsR)
gt <- gtrends("Google", time = "2020-01-01 2023-08-31", geo = "US", category = 0)
gt_trend <- gt$interest_over_time
And this is the plot of my gtrends interest over time (red) versus the Google Trends data downloaded from the website for the exact same term, geo, and time period (blue).
I understand that the gtrends results are a sample of data and so the data may vary, but from my understanding, it shouldn't be pure 0s for a year +, especially not for a popular search term like "Google" itself.
I tried running this many times over and got the same result between yesterday and today.
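To pin down where such a stretch of zeros starts and ends, something like the following could be run on the returned series. This is only a sketch: `longest_zero_run` is a hypothetical helper, and it assumes the `date` and `hits` columns of `gt$interest_over_time` are passed in directly.

```r
# Locate the longest consecutive run of zero hits in a series.
# Returns NULL when the series contains no zeros.
longest_zero_run <- function(dates, hits) {
  # hits may be stored as text; non-numeric entries become NA and are not counted as zero
  hits <- suppressWarnings(as.numeric(as.character(hits)))
  is_zero <- !is.na(hits) & hits == 0
  runs <- rle(is_zero)
  if (!any(runs$values)) return(NULL)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  zero_runs <- which(runs$values)
  k <- zero_runs[which.max(runs$lengths[zero_runs])]
  data.frame(start = dates[starts[k]], end = dates[ends[k]], n = runs$lengths[k])
}

# Possible usage with the data frame from the call above:
# longest_zero_run(gt_trend$date, gt_trend$hits)
```

Reporting the start date, end date, and length of the run makes it easier to check whether different users see the same gap.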
I will try to have a look this week.
Building on my earlier discrepancy, I am finding that some US subnational locations are being returned with no data. I have been using the keyword "flu" and was searching for Texas alone using the following code (as a working example) which returned no data.
gt <- gtrends("flu", time = "2020-01-01 2023-08-31", geo = "US-TX", category = 0)
gt_trend <- gt$interest_over_time
On the Google Trends website, we do see data for Texas for that same time period with the search term "Flu".
To confirm that I wasn't using US geos incorrectly, I ran "Google" for Texas and it did return data, though again there was a discrepancy similar to the one in the national-level search for "Google" (gtrends results in red, download from Google Trends in blue):
Surprisingly, the national-level for "flu" worked as expected and matched Google Trends, suggesting that these issues are intermittent and inconsistent.
The worst part, really, is that we have no public API to access here and hence really no real leg to stand on to complain to Google. The result data is ... what they give us and that is that. (Modulo possible errors in the REST request string but that is mostly ironed out by now.)
How did you get the red curve in your last plot? (I.e., what parameters did you use within gtrendsR?)
I used gtrends("flu", geo='US', time= ('2020-01-01 2023-08-31'))
I'm not sure I understand. Did this call:
gtrends("flu", geo='US', time= ('2020-01-01 2023-08-31'))
match the data you searched for Texas directly on the webpage?
Nope - the last plot was of "Flu" in the US specifically.
My search query for Texas (which yielded no results) was gt<-gtrends('flu',time='2020-01-01 2023-08-31', geo='US-TX', category=0)
We'll get in contact with Google. Hopefully, they have some information I can share here.
Same here, lots of zeros that are not in the .csv from the Google Trends website. Also pytrends (which is the Python equivalent) gives me similar data. So I think this is a problem on the Google side.
My R request does not match the Google Trends website request with this command: mytest <- gtrends(keyword = "KiK", geo = "DE", time = "2023-06-01 2023-07-31", onlyInterest = TRUE) There are a lot of zeros in the result. Also, today's R request does not match yesterday's request with the same command.
What is going on here?
kik_R_jun_jul.xlsx
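For reports like this one, a small comparison can make the mismatch concrete. The sketch below lines up two series by date and flags the dates where one source reports zero and the other does not. `compare_series` is a hypothetical helper; it assumes both inputs are data frames with `date` and `hits` columns, for example two gtrends pulls from different days, or one pull and the website export after it has been read into the same shape (that step is not shown).

```r
# Align two interest-over-time series by date and flag zero/non-zero disagreements.
compare_series <- function(a, b) {
  both <- merge(a[, c("date", "hits")], b[, c("date", "hits")],
                by = "date", suffixes = c("_a", "_b"))
  both$hits_a <- suppressWarnings(as.numeric(as.character(both$hits_a)))
  both$hits_b <- suppressWarnings(as.numeric(as.character(both$hits_b)))
  # %in% treats NA as "not zero", so missing values never count as a zero
  both$zero_mismatch <- xor(both$hits_a %in% 0, both$hits_b %in% 0)
  both$diff <- both$hits_a - both$hits_b
  both
}

# Possible usage: dates where only one of the two sources is zero
# cmp <- compare_series(pull_today, pull_yesterday)
# cmp[cmp$zero_mismatch, ]
```

Sharing the flagged rows rather than only a plot would let others check the same dates against their own pulls.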