scrapy / scrapyd

A service daemon to run Scrapy spiders
https://scrapyd.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
2.92k stars 569 forks

Add support for environment variables #478

Closed Shleif91 closed 11 months ago

Shleif91 commented 1 year ago

Add support for environment variables. This is especially relevant for the username and password.

jpmckinney commented 1 year ago

What is the issue with storing the username/password in the configuration file?

jpmckinney commented 1 year ago

Duplicate of #303

therealpurplemana commented 1 year ago

Would also like this feature. It is necessary for CI/CD deployment.

jpmckinney commented 1 year ago

You can do CI/CD deployment without Scrapyd having support for environment variables, like in https://github.com/scrapy/scrapyd/issues/303#issuecomment-926240350

therealpurplemana commented 1 year ago

Our Scrapy script uploads to AWS S3. Wouldn't this method require writing our keys into a plaintext file? Likewise, if the spider relies on a database connection URL, that URL would have to be written to a plaintext file for an egg to access it via Scrapyd. If an attacker gains access to the server, the secrets can be stolen.

jpmckinney commented 1 year ago

Plaintext files like /etc/shadow or private SSH keys stored in .ssh directories are not inherently insecure. You can use the filesystem's permissions and ownership features to protect such files.
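As a rough illustration of the file-permission approach, here is a minimal Python sketch that writes a Scrapyd configuration file readable only by its owner. It assumes a Unix-like system and that the credentials go in the `username` and `password` options of the `[scrapyd]` section. The helper name `write_scrapyd_conf` and the environment variable names in the comment are hypothetical, not part of Scrapyd.

```python
import configparser
import os


def write_scrapyd_conf(path, username, password):
    """Write a minimal Scrapyd config that only the file's owner can read."""
    # Disable interpolation so a password containing "%" is stored verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config["scrapyd"] = {"username": username, "password": password}
    # Create the file with mode 0600 so the secret is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        # Tighten permissions even if the file already existed with a looser mode.
        os.fchmod(f.fileno(), 0o600)
        config.write(f)


# Hypothetical use in a deploy step that receives secrets from the CI system:
# write_scrapyd_conf("scrapyd.conf",
#                    os.environ["SCRAPYD_USERNAME"],
#                    os.environ["SCRAPYD_PASSWORD"])
```

A deploy job could call this once before starting Scrapyd, so the secrets come from the CI system's secret store and never need to be committed to the repository.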

If an attacker gains root access to your server, they can just as easily run cat /proc/PID/environ to read the environment variables from any process (change PID to the process ID). Environment variables are not any more secure.
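To make that concrete, here is a small Python sketch of reading the same data programmatically. It assumes Linux, where /proc/PID/environ holds NUL-separated KEY=VALUE entries, and it typically works only for root or for the user that owns the process. The function names are illustrative.

```python
import os


def parse_environ(raw):
    """Parse NUL-separated KEY=VALUE bytes, the /proc/PID/environ format."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty piece after the trailing NUL
        # Split on the first "=" only; values may themselves contain "=".
        key, _, value = entry.partition(b"=")
        env[os.fsdecode(key)] = os.fsdecode(value)
    return env


def read_environ(pid="self"):
    """Read a process's environment on Linux; requires permission to do so."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())
```

Calling `read_environ(1234)` for a hypothetical process ID 1234 would return that process's variables as a dictionary, secrets included, to anyone with sufficient privileges.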

jpmckinney commented 11 months ago

Closing, as the need for environment variable support (versus writing files) is unclear.