scrapy / scrapyd

A service daemon to run Scrapy spiders
https://scrapyd.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Cannot access environment variables in the spider running under ScrapyD #490

Closed by iamumairayub 1 year ago

iamumairayub commented 1 year ago

I am using Ubuntu 18.04 and have tried setting some environment variables in /etc/environment and also in my .bashrc file.

After running source myFile, echo $my_var shows the variable correctly.

When I run my spider with scrapy crawl mySpider, the environment variables show up just fine.

But when the same spider is run under ScrapyD, the environment variable is empty.

I tried printing the user with getpass.getuser(), and it shows the same user whether I run the scraper from the terminal or from ScrapyD.

I saw this issue, but it only says to restart ScrapyD. I have tried restarting ScrapyD, and logging out of and back in to the terminal, but to no avail.

How can I access an environment variable in a spider running under ScrapyD?

jpmckinney commented 1 year ago

It depends how you are running Scrapyd.

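For example, I use systemd to create a Scrapyd service. systemd does not start services through a login shell, so a user's .bashrc file, etc. is never run. .bashrc is read by interactive shells, and /etc/environment is applied to login sessions (through PAM on Ubuntu), so neither reaches a systemd service by default.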

Instead, I need to set Environment="myvar=myvalue" in the service file.

Whatever you're using to run Scrapyd probably has its own way to configure environment variables.
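The underlying mechanism can be illustrated with a minimal Python sketch: a child process only sees the environment its parent hands to it, regardless of what a shell startup file would have exported. This assumes the spider process inherits its environment from the Scrapyd process; read_in_child is a hypothetical helper, and myvar / myvalue are just the placeholder names used above.

```python
import os
import subprocess
import sys

# Placeholder variable name taken from the example above (an assumption).
VAR = "myvar"


def read_in_child(env):
    """Start a child Python process with exactly `env` and return what it sees for VAR."""
    code = f"import os; print(os.environ.get({VAR!r}, '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,                # the child gets this mapping and nothing else
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Simulate a service manager that never sourced .bashrc: VAR is absent.
base = {k: v for k, v in os.environ.items() if k != VAR}
without_var = read_in_child(base)

# Simulate the same service with the variable configured explicitly.
with_var = read_in_child({**base, VAR: "myvalue"})

print("without:", without_var)
print("with:   ", with_var)
```

The same reasoning applies to any process supervisor: the variable has to be present in the environment of the process that launches Scrapyd, not just in an interactive terminal.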

Here's my /etc/systemd/system/scrapyd.service file:

[Unit]
Description=Scrapyd
After=network.target

[Service]
User=scrapyd
Group=scrapyd
Environment="myvar=myvalue"
# More Environment lines...
WorkingDirectory=/home/scrapyd/scrapyd
ExecStart=/home/scrapyd/scrapyd/.ve/bin/scrapyd --nodaemon --logfile=/var/log/scrapyd/scrapyd.log

[Install]
WantedBy=multi-user.target

iamumairayub commented 1 year ago

I finally fixed my issue by using:

[Service]
EnvironmentFile=/etc/environment
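
To confirm that a service-level setting like this actually reaches the spider, a fail-fast check along the following lines could be called from the spider code. This is only a sketch: require_env is a hypothetical helper, not part of Scrapy or Scrapyd, and my_var is the variable name from the original question.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        # Including the PID makes it easier to tell which process lacked the variable.
        raise RuntimeError(
            f"Environment variable {name!r} is not set for this process "
            f"(PID {os.getpid()})"
        )
    return value


if __name__ == "__main__":
    # Hypothetical usage: report whether my_var is visible to this process.
    try:
        print("my_var =", require_env("my_var"))
    except RuntimeError as exc:
        print("missing:", exc)
```

Raising early makes a missing variable show up as an explicit error in the job log rather than as an empty string further down the crawl. Note that systemd only picks up edits to a unit file after systemctl daemon-reload, followed by a restart of the service.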