Closed sujaymansingh closed 3 months ago
I guess that it is Twisted that actually changes the working directory to --rundir?
See https://twistedmatrix.com/trac/ticket/2572. Does updating Twisted solve this? You can work around it by using absolute paths for eggs_dir, logs_dir and dbs_dir (and items_dir, if used).
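To check whether a config already follows the absolute-path workaround, something like the following might help. It is only a sketch: the sample config text and the helper name relative_dir_options are made up for illustration, and it assumes POSIX-style paths and a [scrapyd] section.

```python
import configparser
import os

# Hypothetical config text for illustration; a real scrapyd.conf would be read from disk.
SAMPLE_CONF = """
[scrapyd]
eggs_dir = /var/scrapyd/eggs
logs_dir = /var/scrapyd/logs
dbs_dir = dbs
"""

def relative_dir_options(conf_text, names=("eggs_dir", "logs_dir", "dbs_dir", "items_dir")):
    """Return the directory options that are set but not absolute."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["scrapyd"]
    flagged = []
    for name in names:
        value = section.get(name)
        # Unset options are skipped; relative values depend on the working directory.
        if value and not os.path.isabs(value):
            flagged.append(name)
    return flagged

print(relative_dir_options(SAMPLE_CONF))
```

Any option it reports is one whose location would depend on the working directory at the time the path is used.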
If this is indeed caused by the above bug, we can't wait for them to fix it, because they first have to discuss and decide between the two behaviours. In that case a workaround would be worthwhile: either overriding the default Twisted app argument parsing, or making scrapyd use the rundir option when preparing paths.
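As a rough illustration of the second option (using rundir when preparing paths), relative directory settings could be anchored at the run directory before anything reads them. This is a sketch, not scrapyd's actual code: resolve_dirs and the settings dict are hypothetical names, and the example paths assume a POSIX system.

```python
import os

def resolve_dirs(rundir, dirs):
    """Anchor each relative directory setting at rundir; absolute ones are kept."""
    base = os.path.abspath(rundir)
    # os.path.join discards `base` when the second argument is already absolute.
    return {
        name: os.path.normpath(os.path.join(base, path))
        for name, path in dirs.items()
    }

# Hypothetical settings, mirroring the option names mentioned above.
settings = {"eggs_dir": "eggs", "logs_dir": "logs", "dbs_dir": "/srv/dbs"}
resolved = resolve_dirs("/var/scrapyd", settings)
print(resolved)
```

With paths resolved this way, it would no longer matter whether the working directory changes before or after the eggs are first listed.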
Since this is the only issue report about --rundir in 10 years, I am simply removing it as an option. 9fa4091
If deployed using systemd, for example, WorkingDirectory= can be used instead.
Hi
I run scrapyd with the --rundir option (version 1.0.1). I have the following issue.
listprojects.json
)listprojects.json
$rundir/eggs
I suspect the issue is to do with when the directory is changed.
It seems that SpiderScheduler will load the eggs/projects when initialised: https://github.com/scrapy/scrapyd/blob/1.0.1/scrapyd/scheduler.py#L12
But I think this is done before changing the working directory to --rundir.
To investigate, I added a couple of hacky print statements to SpiderScheduler.
I then grepped the logs for "SpiderScheduler" (after restarting and then making a call in my browser to listprojects.json).
So /opt/skuscraper is my project directory (with the scrapyd.conf). But I want the working directory to be separate (so it doesn't put any extra files in the app directory); that is why I use /var/scrapyd as the run dir.
We can see that when the SpiderScheduler object is init'd, the current dir is /opt/skuscraper, so it can't find any eggs. But after the app starts up, it uses /var/scrapyd.
So any deploys of eggs after the app starts up are saved to /var/scrapyd/eggs, but when scrapyd is restarted, it loads its initial list of eggs from /opt/skuscraper (where they won't exist).
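The mechanism described above can be reproduced outside scrapyd with a small sketch. EggLister is a hypothetical stand-in, not the real SpiderScheduler, and the two temporary directories play the roles of the app directory and the run dir.

```python
import os
import tempfile

class EggLister:
    """Hypothetical stand-in for a component that scans a relative eggs dir on init."""

    def __init__(self, eggs_dir="eggs"):
        self.eggs_dir = eggs_dir
        # Scanned once at construction time, against whatever the cwd is right now.
        self.projects_at_init = self.list_projects()

    def list_projects(self):
        # A relative eggs_dir is resolved against the *current* working directory.
        if not os.path.isdir(self.eggs_dir):
            return []
        return sorted(os.listdir(self.eggs_dir))

original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as app_dir, tempfile.TemporaryDirectory() as run_dir:
    # One deployed project lives under the run dir only.
    os.makedirs(os.path.join(run_dir, "eggs", "myproject"))
    try:
        os.chdir(app_dir)        # process starts in the app directory
        lister = EggLister()     # initial scan happens here
        os.chdir(run_dir)        # working directory changes afterwards
        projects_after_chdir = lister.list_projects()
    finally:
        os.chdir(original_cwd)

print(lister.projects_at_init)   # []
print(projects_after_chdir)      # ['myproject']
```

The same relative path gives an empty list at init time and the deployed project later, which matches the behaviour seen after a restart.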