tasos-py / Search-Engines-Scraper

Search google, bing, yahoo, and other search engines with python
MIT License

Google text attribute is empty #19

Closed sershev closed 3 years ago

sershev commented 3 years ago

Many thanks again for your work; it works great. Recently I noticed that when searching with Google, the text field is empty for all results.

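# Assumes the package is importable as search_engines
from search_engines import Google

se = Google()
res = se.search("news", 1)  # query, number of pages
res.results()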

[{'host': 'bbc.com',
  'link': 'https://www.bbc.com/news/world',
  'title': 'World - BBC Newshttps://www.bbc.com › news › world',
  'text': ''},
 {'host': 'edition.cnn.com',
  'link': 'https://edition.cnn.com/world',
  'title': 'World news – breaking news, videos and headlines - CNNhttps://edition.cnn.com › world',
  'text': ''},
 {'host': 'theguardian.com',
  'link': 'https://www.theguardian.com/world',
  'title': 'Latest news from around the world | The Guardianhttps://www.theguardian.com › world',
  'text': ''},
 {'host': 'hindustantimes.com',
  'link': 'https://www.hindustantimes.com/world-news',
  'title': 'World News, Latest World News, Breaking News and ...https://www.hindustantimes.com › World News',
  'text': ''},
 {'host': 'reuters.com',
  'link': 'https://www.reuters.com/news/archive/worldNews',
  'title': 'World News Headlines | Reutershttps://www.reuters.com › news › archive › worldNews',
  'text': ''},
 {'host': 'abcnews.go.com',
  'link': 'https://abcnews.go.com/International/',
  'title': 'International News | Latest World News, Videos & Photos ...https://abcnews.go.com › International',
  'text': ''},
 {'host': 'news.sky.com',
  'link': 'https://news.sky.com/world',
  'title': 'World News - Breaking international news and headlines | Sky ...https://news.sky.com › world',
  'text': ''},
 {'host': 'nytimes.com',
  'link': 'https://www.nytimes.com/section/world',
  'title': 'World News - The New York Timeshttps://www.nytimes.com › section › world',
  'text': ''},
tasos-py commented 3 years ago

Thanks for letting me know. They changed their HTML structure and the CSS selector we had stopped working. It's fixed now.
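
A broken selector like this tends to show up as every result having an empty text field, as in the output above. The following is a minimal sketch of a check for that, assuming results are plain dicts with a 'text' key as shown in the report. The helper name empty_text_ratio is hypothetical and not part of the library.

```python
def empty_text_ratio(results):
    """Return the fraction of result dicts whose 'text' field is empty or missing."""
    if not results:
        return 0.0
    empty = sum(1 for r in results if not (r.get("text") or "").strip())
    return empty / len(results)


# Two entries shaped like the output reported in this issue.
sample = [
    {"host": "bbc.com", "link": "https://www.bbc.com/news/world", "text": ""},
    {"host": "edition.cnn.com", "link": "https://edition.cnn.com/world", "text": ""},
]

ratio = empty_text_ratio(sample)
if ratio == 1.0:
    # Every snippet is empty: the text selector has probably stopped matching.
    print("all text fields are empty")
```

A ratio close to 1.0 across a whole page suggests a selector problem rather than results that simply have no snippet.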

Jefferson111 commented 3 years ago

Recently, Google's HTML structure seems to change from page to page as more queries are submitted. More and more text fields come back empty as usage of the Google search engine increases. Currently, the selector span > span works decently, but at times entire batches of results have a completely empty text field. Are there any plans to fix this, or to explore the different HTML structures Google can generate?
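
One possible way to cope with several HTML variants is to try a list of selectors in order and keep the first one that yields text. The sketch below is only an illustration of that idea, not how the library works. The helper first_non_empty_text, the selector list, and the dict standing in for a parsed result block are all hypothetical; only span > span comes from this thread.

```python
from typing import Callable, Iterable, Optional


def first_non_empty_text(
    selectors: Iterable[str],
    select_text: Callable[[str], Optional[str]],
) -> str:
    """Try each selector in order and return the first non-empty, stripped text."""
    for selector in selectors:
        text = select_text(selector)
        if text and text.strip():
            return text.strip()
    return ""  # no selector matched anything useful


# Hypothetical fallback list; only "span > span" is mentioned in the thread.
FALLBACK_SELECTORS = ["span > span", "div > span", "div"]

# Stand-in for a real HTML lookup: maps selector -> text found in one result block.
fake_result_block = {"span > span": "", "div > span": "  Latest world news ...  "}

snippet = first_non_empty_text(FALLBACK_SELECTORS, fake_result_block.get)
print(snippet)
```

In real use, select_text would wrap whatever HTML parser is in play, returning the text of the first element matching the selector or None when nothing matches.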