Closed sif-gondy closed 2 years ago
I was actually wrestling with this over the weekend regarding the when= parameter. I was getting stuff from months ago with when='3d'.
In my browser, doing the same query on the site itself (in the U.S.), the when= parameter worked fine.
Yes, on closer inspection it's definitely the time_query formatting with the period parameter:
def _ceid(self):
    time_query = ''
    if self._start_date or self._end_date:
        if inspect.stack()[2][3] != 'get_news':
            warnings.warn(message=("Only searches using the function get_news support date ranges. Review the "
                                   f"documentation for {inspect.stack()[2][3]} for a partial workaround. \nStart "
                                   "date and end date will be ignored"), category=UserWarning, stacklevel=4)
            if self._period:
                time_query += 'when%3A'.format(self._period)  # <---- maybe the "when%3A" formatting with the period string
        if self._period:
            warnings.warn(message=f'\nPeriod ({self.period}) will be ignored in favour of the start and end dates',
                          category=UserWarning, stacklevel=4)
        if self.end_date is not None:
            time_query += '%20before%3A{}'.format(self.end_date)
        if self.start_date is not None:
            time_query += '%20after%3A{}'.format(self.start_date)
    elif self._period:
        time_query += 'when%3A'.format(self._period)
    return time_query + '&ceid={}:{}&hl={}-{}&gl={}'.format(self._country,
                                                            self._language,
                                                            self._language,
                                                            self._country,
                                                            self._country)
It works perfectly if I provide language='fr' and country='FR' and omit a period. So you can disregard my comment about Google updating their available countries/languages.
Oh interesting -- so the time filters will work if language and country are specified?
No. Pretty much every time a time_query is passed, whether it is the period or _start_date/_end_date, the get_news method returns very few results (or even None for very popular keywords in 5y searches) and/or results outside the desired time range. For example, a simple request:
from gnews import GNews
from pprint import pprint
google_news = GNews(language='en', country="US", period="7d")
news = google_news.get_news('Pakistan')
pprint(news, indent=4)
Returns:
[ { 'description': "Babar Azam: 'Pakistan's lower order falling cheaply "
"was disappointing' ESPNcricinfo",
'published date': 'Thu, 28 Jul 2022 07:00:00 GMT',
'publisher': { 'href': 'https://www.espncricinfo.com',
'title': 'ESPNcricinfo'},
'title': "Babar Azam: 'Pakistan's lower order falling cheaply was "
"disappointing' - ESPNcricinfo",
'url': 'https://www.espncricinfo.com/story/sl-vs-pak-2022-2nd-test-babar-azam-says-pakistans-lower-order-falling-cheaply-was-disappointing-1326527'}]
(Note the date is outside the desired range)
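Until the time filter behaves, one possible stopgap is to drop out-of-range articles on the client side. This is only a minimal sketch: filter_recent is a hypothetical helper, not part of GNews, and it assumes every article has a 'published date' string in the same format as the output above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def filter_recent(articles, days, now=None):
    """Keep only articles published within the last `days` days.

    Hypothetical helper, not part of GNews. Assumes each article dict has a
    'published date' string such as 'Thu, 28 Jul 2022 07:00:00 GMT'.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [
        article for article in articles
        # parsedate_to_datetime returns a timezone-aware datetime for 'GMT'
        if parsedate_to_datetime(article['published date']) >= cutoff
    ]
```

With the request above, filter_recent(news, 7) would discard the ESPNcricinfo item if it is more than seven days old at the time of the call. It does not bring back the missing results, it only hides the ones outside the window.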
This happens not only for English (US) but for other languages as well. If you remove every time_query parameter you get the full output, though, so this bug has to be about a change in the URL formatting of those parameters:
time_query += '%20before%3A{}'.format(self.end_date)
time_query += '%20after%3A{}'.format(self.start_date)
time_query += 'when%3A'.format(self._period)
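For anyone reading those strings cold: %3A and %20 are just the percent-encoded colon and space, so the intended operators are when:7d, before:<date> and after:<date>. A small sketch using the standard library to show the encoding; encode_operator is a hypothetical helper and the date below is only an illustrative value, not something GNews provides.

```python
from urllib.parse import quote


def encode_operator(name, value, leading_space=False):
    """Percent-encode a search operator such as when:7d.

    Hypothetical helper for illustration only, not part of GNews.
    """
    prefix = ' ' if leading_space else ''
    # quote() turns ':' into %3A and ' ' into %20; letters, digits and '-' stay as-is
    return quote(f'{prefix}{name}:{value}')


print(encode_operator('when', '7d'))
print(encode_operator('before', '2022-08-01', leading_space=True))
```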
Shouldn't this line have curly brackets somewhere for the .format() call?
time_query += 'when%3A'.format(self._period)
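A quick way to check that suspicion in plain Python, independent of GNews: str.format() silently ignores arguments when the string has no replacement field, so the period never makes it into the query. The variable names here are just for illustration, and whether this alone explains the out-of-range results is not established by this snippet.

```python
period = '7d'

# No replacement field: .format() ignores the extra argument without raising.
broken = 'when%3A'.format(period)

# With a {} placeholder the period is actually inserted.
fixed = 'when%3A{}'.format(period)

print(repr(broken))  # 'when%3A'
print(repr(fixed))   # 'when%3A7d'
```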
Hopefully someone can chime in here to confirm @sif-gondy's suspicions.
This issue has been fixed in https://github.com/ranahaani/GNews/pull/41. Thanks, @sif-gondy.
@ranahaani Hello, period still isn't working for me.
Hello,
Thanks for providing this piece of code.
I have recently come across weird behavior with the period parameter (e.g. with 7d, you can get news from weeks prior). More importantly, the number of news results has dropped dramatically recently when combining countries and languages, or even when just providing a language and leaving the country parameter as None (for English).
Setting the language parameter to any other language (e.g. French ['fr']) systematically returns 0 articles, even for popular searches.
I suspect Google has changed/updated their URL format and/or available countries/languages!