GeminidSystems / GoogleNewsScraper

A Python package that scrapes Google News article data while remaining undetected by Google. Our scraper can scrape page data up until the last page and never trigger a CAPTCHA (download stats: https://pepy.tech/project/GoogleNewsScraper)
https://pypi.org/project/GoogleNewsScraper/
MIT License
11 stars, 5 forks

Stability Enhancement - Replace selection by class name #2

Open karlgunst opened 3 years ago

karlgunst commented 3 years ago

@abnoviello23

For stability, we want to stop using find_elements_by_class_name and div[contains(@class)] selectors. Google changes its class names regularly, and every change breaks our script.

We should be able to select what we need using one of the following

abnoviello23 commented 3 years ago

@karlgunst It looks like there are not many IDs to go around for this one, but I will change all of the class-name selectors we are using to full XPaths and tag names.
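
For illustration, here is a minimal sketch of the idea, not the package's actual code. It selects article links by tag structure instead of class name, using only the standard library so it runs without a browser. The markup in `SAMPLE`, the `ARTICLE_LINK_XPATH` expression, and the `extract_articles` helper are all hypothetical; the real Google News markup is different and would need its own expressions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The class names stand in for the
# auto-generated ones that change between releases.
SAMPLE = """
<div id="search">
  <div class="xY7zQ">
    <a href="https://example.com/story-1"><h3>First headline</h3></a>
  </div>
  <div class="bN4kP">
    <a href="https://example.com/story-2"><h3>Second headline</h3></a>
  </div>
</div>
"""

# Structural selector (an assumption for this sample markup):
# any <a> that has an href attribute and a direct <h3> child.
# No class name appears anywhere in the expression.
ARTICLE_LINK_XPATH = ".//a[@href][h3]"


def extract_articles(markup):
    """Return a list of {'title', 'url'} dicts found via the structural XPath."""
    root = ET.fromstring(markup)
    articles = []
    for link in root.findall(ARTICLE_LINK_XPATH):
        articles.append({"title": link.find("h3").text, "url": link.get("href")})
    return articles


if __name__ == "__main__":
    for article in extract_articles(SAMPLE):
        print(article)
```

With Selenium 4, an expression of this kind would be passed as `driver.find_elements(By.XPATH, "//a[@href][h3]")`. The trade-off is that a structural XPath survives class renames but still breaks if the tag nesting itself changes, so shorter relative paths are likely to hold up better than full absolute ones.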