jontewks / puppeteer-heroku-buildpack

Installs the dependencies needed to run Puppeteer on Heroku.
MIT License
486 stars, 303 forks

Usage issue #46

Closed by lynxionxs 5 years ago

lynxionxs commented 5 years ago

I've got this running on Heroku. But item 21 under Prohibited Actions in Heroku's Acceptable Use Policy seems to say we are not allowed to use such tools on Heroku for scraping websites. Am I missing something?

Prohibited Actions:
21. Use the Service to access a third party web property for the purposes of web scraping, web crawling, web monitoring, or other similar activity through a web client that does not take commercially reasonable efforts to:
identify itself via a unique User Agent string describing the purpose of the web client; and
obey the robots exclusion standard (also known as the robots.txt standard), including the crawl-delay directive;

jontewks commented 5 years ago

Puppeteer is used for more than just scraping websites. It's a tool, and it's up to you how you use it. Also, from the quoted text:

identify itself via a unique User Agent string describing the purpose of the web client; and obey the robots exclusion standard (also known as the robots.txt standard), including the crawl-delay directive;

So if that is how you want to use it, that's what you need to do to follow Heroku's rules.
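
As a rough illustration of those two requirements, here is a minimal TypeScript sketch, not an official Heroku or Puppeteer recipe. It defines a descriptive User-Agent string and a simplified robots.txt check that returns the Disallow/Allow rules and the crawl-delay for a bot. Every name in it (`ExampleMonitorBot`, `example.com`, `parseRobots`, `isAllowed`) is a hypothetical chosen for the example, and the parser only does plain prefix matching (no `*` or `$` wildcards) and exact, case-insensitive matching of the bot name.

```typescript
// Hypothetical bot identity: a unique name, a contact URL, and a stated purpose.
// With Puppeteer this string could be applied via `await page.setUserAgent(USER_AGENT)`.
const USER_AGENT = "ExampleMonitorBot/1.0 (+https://example.com/bot-info; uptime checks)";

interface RobotsRules {
  disallow: string[];
  allow: string[];
  crawlDelaySeconds: number | null; // null when no Crawl-delay is given
}

function emptyRules(): RobotsRules {
  return { disallow: [], allow: [], crawlDelaySeconds: null };
}

// Parse robots.txt text and return the rule group for `botName`,
// falling back to the "*" group, then to "no restrictions".
function parseRobots(robotsTxt: string, botName: string): RobotsRules {
  const groups: Record<string, RobotsRules> = Object.create(null);
  let currentAgents: string[] = [];
  let lastWasAgent = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Strip comments, then split "Field: value".
    const hash = rawLine.indexOf("#");
    const line = (hash === -1 ? rawLine : rawLine.slice(0, hash)).trim();
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines share one group; otherwise start a new group.
      if (!lastWasAgent) currentAgents = [];
      const agent = value.toLowerCase();
      currentAgents.push(agent);
      if (!groups[agent]) groups[agent] = emptyRules();
      lastWasAgent = true;
      continue;
    }
    lastWasAgent = false;

    for (const agent of currentAgents) {
      const rules = groups[agent];
      if (!rules || value === "") continue;
      if (field === "disallow") rules.disallow.push(value);
      else if (field === "allow") rules.allow.push(value);
      else if (field === "crawl-delay") {
        const seconds = Number(value);
        if (isFinite(seconds) && seconds >= 0) rules.crawlDelaySeconds = seconds;
      }
    }
  }
  return groups[botName.toLowerCase()] ?? groups["*"] ?? emptyRules();
}

// The longest matching prefix wins; a tie goes to Allow.
function isAllowed(rules: RobotsRules, path: string): boolean {
  const longest = (prefixes: string[]): number =>
    prefixes
      .filter((p) => path.indexOf(p) === 0)
      .reduce((max, p) => Math.max(max, p.length), -1);
  return longest(rules.allow) >= longest(rules.disallow);
}

// Example with an inline robots.txt (in practice it would be fetched from the target site).
const sampleRobots = ["User-agent: *", "Disallow: /private/", "Crawl-delay: 5"].join("\n");
const sampleRules = parseRobots(sampleRobots, "ExampleMonitorBot");
console.log(USER_AGENT);
console.log(isAllowed(sampleRules, "/index.html"), sampleRules.crawlDelaySeconds);
```

A crawler built this way would call `isAllowed` before each navigation and wait at least `crawlDelaySeconds` between requests to the same host. Whether that amounts to "commercially reasonable efforts" is for Heroku to judge.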

Also, this repo is just the buildpack for getting Puppeteer running on Heroku and nothing more, so questions about Heroku's acceptable use policy might be better directed to them.