The web scraper fails to parse some of the relevant websites on BRIGHTDATA's spreadsheet. Single out the failing sites and fix them individually.
UPDATE: When the scraper accessed these websites, they flagged it as an attacker and blocked it. Trying a browser-based scraper, Selenium, instead.
UPDATE: Recreated the original web scraper using Selenium and Beautiful Soup. The new scraper is slower (about 100 sources every 10 minutes, roughly 6 seconds per source) but seems to work with most URLs.
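
A minimal sketch of what a Selenium plus Beautiful Soup scraper along these lines might look like. This is not the actual scraper: the function names (`make_driver`, `extract_text`, `scrape`, `fetch_with`, `main`), the headless Chrome setup, the 30-second timeout, and the example URL are all assumptions for illustration. It collects the URLs that fail so they can be looked at one by one.

```python
import time


def make_driver():
    """Start a headless Chrome session (assumes Chrome and Selenium 4 are installed)."""
    # Imported here so the rest of the module loads without Selenium present.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    driver.set_page_load_timeout(30)  # assumed limit, tune as needed
    return driver


def extract_text(html):
    """Return the visible text of a page, ignoring scripts and styles."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()
    return soup.get_text(separator=" ", strip=True)


def fetch_with(driver):
    """Build a fetch function bound to a Selenium driver."""
    def fetch(url):
        driver.get(url)  # the browser renders the page like a normal visitor
        return driver.page_source
    return fetch


def scrape(urls, fetch, parse, delay=0.0):
    """Fetch and parse each URL, collecting failures instead of stopping."""
    results, failures = {}, {}
    for url in urls:
        try:
            results[url] = parse(fetch(url))
        except Exception as exc:  # keep the URL so it can be fixed individually
            failures[url] = repr(exc)
        time.sleep(delay)  # optional pause between requests
    return results, failures


def main():
    # Placeholder list; the real URLs would come from the spreadsheet.
    urls = ["https://example.com"]
    driver = make_driver()
    try:
        results, failures = scrape(urls, fetch_with(driver), extract_text, delay=1.0)
        print(len(results), "parsed;", len(failures), "failed")
    finally:
        driver.quit()  # always close the browser, even after an error
```

Calling `main()` would run the sketch against the placeholder list. Keeping `fetch` and `parse` as separate arguments means the loop can be checked with stub functions, without launching a browser.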