kotartemiy / pygooglenews

If Google News had a Python library
https://newscatcherapi.com
MIT License
1.26k stars 134 forks

My search results return fewer news than I expect, is this normal? #19

Open laurence-lin opened 3 years ago

laurence-lin commented 3 years ago

Thank you for the great tool! I would like to scrape news data from Google News at large scale. However, when I search for the keyword 'covid' over 48 months, I get only 100 news items. Is that normal? I don't think Google News has so little data on the topic. Or does the API limit the number of results returned? Here is my code:

from pygooglenews import GoogleNews

gn = GoogleNews()
search = gn.search("covid", when = '60m') # set the keyword

all_news = search['entries']

print("There are totally {} news".format(len(all_news)))

jbxiang commented 2 years ago

The number of results per search is capped at 100 by Google.

astavri commented 2 years ago

Workaround: if you run the search one day at a time, with the date range defined beforehand, you are not constrained by the limit on what Google returns per search. You need datetime for this.

from datetime import datetime, timedelta

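# min_date and max_date are datetime objects defined earlier; gn, searchlist and i come from the surrounding code
while min_date < max_date:  # step from min_date to max_date, one day per iteration
    min1_date = min_date + timedelta(days=1)
    print("From:" + min_date.strftime('%Y-%m-%d'))
    print("To:" + min1_date.strftime('%Y-%m-%d'))
    # `from` is a reserved word in Python, so the keyword arguments are from_ and to_
    search = gn.search(searchlist[i], from_=min_date.strftime('%Y-%m-%d'), to_=min1_date.strftime('%Y-%m-%d'))
    min_date = min1_date  # advance to the next day, otherwise the loop never ends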
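
For anyone who wants to collect the results rather than just print the dates, here is a minimal sketch of the same day-by-day idea. The helper names `daily_windows` and `collect_entries` are hypothetical (they are not part of pygooglenews), and `search_fn` is assumed to be any callable that returns a dict with an `'entries'` list, as `gn.search` does. De-duplicating on the `'link'` field is my own assumption, added because adjacent day windows may return the same article.

```python
from datetime import date, timedelta


def daily_windows(start, end):
    """Yield (from, to) date strings for each one-day window in [start, end)."""
    current = start
    while current < end:
        nxt = current + timedelta(days=1)
        yield current.strftime('%Y-%m-%d'), nxt.strftime('%Y-%m-%d')
        current = nxt


def collect_entries(search_fn, query, start, end):
    """Call search_fn once per day window and merge the entries.

    search_fn(query, from_day, to_day) must return a dict with an
    'entries' list. Entries whose 'link' was already seen are skipped.
    """
    seen = set()
    entries = []
    for from_day, to_day in daily_windows(start, end):
        result = search_fn(query, from_day, to_day)
        for entry in result.get('entries', []):
            link = entry.get('link')
            if link is not None and link in seen:
                continue  # same article returned by an earlier window
            seen.add(link)
            entries.append(entry)
    return entries


# Hypothetical usage with pygooglenews (performs network requests):
#   from pygooglenews import GoogleNews
#   gn = GoogleNews()
#   news = collect_entries(
#       lambda q, f, t: gn.search(q, from_=f, to_=t),
#       "covid", date(2021, 1, 1), date(2021, 2, 1))
```

Passing the search as a callable keeps the date logic testable without hitting Google, and makes it easy to add a delay between requests inside the lambda if you run into rate limiting.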