scrapy / scrapyd

A service daemon to run Scrapy spiders
https://scrapyd.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Specify settings for spiders that share root directories between projects #506

Closed Aaron2516 closed 1 month ago

Aaron2516 commented 1 month ago

I have a scrapy project with a directory structure similar to the following:

scrapy.cfg
setup.py
myproject/
    __init__.py
    ...
    settings/
       __init__.py
       base.py
       foo.py
       bar.py
    spiders/
        __init__.py
        foo.py
        bar.py
        ...

scrapy.cfg

[settings]
default = myproject.settings.base
foo = myproject.settings.foo
bar = myproject.settings.bar

setup.py

from setuptools import setup 
setup(
    ...
    entry_points={
        'scrapy': ['settings = myproject.settings.base']
    },
    ...
)   

When calling schedule.json, is there a way to tell Scrapyd to switch to the settings module I need? For example, to start the foo spider with myproject.settings.foo and the bar spider with myproject.settings.bar, instead of the module named in the entry_points field of setup.py.
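For reference, the scheduling call in question looks something like this (project and spider names taken from the layout above, assuming Scrapyd on its default port):

```shell
# Schedule the foo spider on a locally running Scrapyd. The settings
# module is resolved from the deployed egg's entry_points (here
# myproject.settings.base); schedule.json has no parameter that would
# point it at myproject.settings.foo instead.
curl http://localhost:6800/schedule.json -d project=myproject -d spider=foo
```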

jpmckinney commented 1 month ago

This is more a question for Scrapy than for Scrapyd, since the setup.py file is read by Scrapy, not Scrapyd.

I suggest using spider arguments instead. Your spider's __init__ method can read those arguments and reconfigure the spider accordingly.