scrapy / scrapyd-client

Command line client for Scrapyd server
BSD 3-Clause "New" or "Revised" License

Add an option to build self-contained eggs #79

Closed mxdev88 closed 2 years ago

mxdev88 commented 3 years ago

Currently, eggs are built and uploaded to scrapyd without their dependencies. This means dependencies have to be installed on the scrapyd server for the spiders to run. If the scrapyd server runs multiple projects that rely on different versions of dependencies, it can quickly get messy.

What do you think about creating a new option --include-deps that will bundle dependencies within the egg?
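As a rough illustration of why bundling can work at all: an egg is a zip archive, and Python can import modules directly from a zip file placed on `sys.path`. The sketch below is not how scrapyd-client or any egg builder is implemented; the module names `fake_dep` and `fake_project` and the file name `bundle.egg` are hypothetical, chosen only for the demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_bundle(path):
    """Write a zip archive holding a project module and a bundled dependency."""
    with zipfile.ZipFile(path, "w") as archive:
        # Hypothetical dependency that would otherwise have to be installed on the server.
        archive.writestr(
            "fake_dep.py",
            "def greet():\n    return 'hello from bundled dep'\n",
        )
        # Hypothetical project module that imports the bundled dependency.
        archive.writestr(
            "fake_project.py",
            "import fake_dep\n\ndef run():\n    return fake_dep.greet()\n",
        )
    return path


bundle = build_bundle(os.path.join(tempfile.mkdtemp(), "bundle.egg"))

# Putting the archive on sys.path lets the import system resolve both modules from it.
sys.path.insert(0, bundle)
result = importlib.import_module("fake_project").run()
print(result)
```

Because both modules live in the same archive, nothing needs to be installed separately for the import to succeed, which is the property a self-contained egg would rely on.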

jpmckinney commented 3 years ago

@mxdev88 That would indeed be great!

I am a new maintainer of the project, and will be happy to review any PR.

jpmckinney commented 3 years ago

Is this branch working? https://github.com/mxdev88/scrapyd-client/tree/build-with-dependencies

jpmckinney commented 3 years ago

Another solution: https://github.com/scrapy/scrapyd/pull/269 which included https://github.com/cleocn/scrapyd/commit/afce5a826bd0b8f4b179481b02167e44f1b450f4 mentioned in https://github.com/scrapy/scrapyd/issues/246#issuecomment-356868372. However, it looks like the maintainers did not favor that solution.

mxdev88 commented 3 years ago

> Is this branch working? https://github.com/mxdev88/scrapyd-client/tree/build-with-dependencies

hey @jpmckinney I think it does work but I will need to redo some tests. If I remember correctly, I did not make a PR because it requires a new dependency (not sure you want that) and relies on requirements.txt (not sure if it's the best solution). But if you could take a look and give comments, I'm happy to make adjustments when time allows.

Btw, great news that scrapyd and scrapyd-client found a new maintainer!

new-wxw commented 3 years ago

@mxdev88 Is this branch working? https://github.com/mxdev88/scrapyd-client/tree/build-with-dependencies I couldn't build for local use, but nothing abnormal was reported.

divtiply commented 3 years ago

You can also use https://pypi.org/project/uberegg/ to build an egg containing the code with its dependencies, then deploy this egg to scrapyd with its REST API or scrapyd-deploy.
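For the REST API route, Scrapyd accepts eggs through its `addversion.json` endpoint, which takes `project`, `version` and `egg` form fields. The following is a minimal sketch that only builds the multipart request with the standard library and does not send it; the helper name `build_addversion_request`, the project name `myproject`, the file name `project.egg` and the URL `http://localhost:6800` are assumptions for illustration.

```python
import uuid
from urllib.request import Request


def build_addversion_request(base_url, project, version, egg_bytes):
    """Build (but do not send) a multipart POST for Scrapyd's addversion.json."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields.
    for name, value in (("project", project), ("version", version)):
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # The egg itself, sent as a file field named "egg".
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="egg"; filename="project.egg"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + egg_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return Request(
        f"{base_url.rstrip('/')}/addversion.json",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder bytes stand in for the contents of a real egg file.
request = build_addversion_request("http://localhost:6800", "myproject", "1.0", b"egg-bytes")
# urllib.request.urlopen(request) would submit it to a running Scrapyd server.
```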

mxdev88 commented 2 years ago

> You can also use https://pypi.org/project/uberegg/ to build an egg containing the code with its dependencies, then deploy this egg to scrapyd with its REST API or scrapyd-deploy.

@divtiply this one was not available when I looked into this feature. Honestly, I was wondering whether it would be preferable to rely on install_requires in setup.py (which is present in a scrapy project) or setup.cfg to list dependencies rather than a requirements.txt file, for the reasons explained in install-requires-vs-requirements. Do you think this feature could be added to uberegg? If so, I think it could be a good reason to switch before making a PR.

divtiply commented 2 years ago

@mxdev88 setup.py install_requires should normally contain all project dependencies, including scrapy. The uberegg deployment is intended to contain only the project files and the dependencies that are additional to the Scrapyd stack, so no scrapy/twisted/etc. This is similar to Scrapinghub's shub deployment: https://shub.readthedocs.io/en/stable/deploying.html#deploying-dependencies

mxdev88 commented 2 years ago

hey @divtiply, I see your point. However, scrapyd-client generates a default setup.py during its build step in order to build the egg (when setup.py does not exist), cf. https://github.com/scrapy/scrapyd-client/blob/master/scrapyd_client/deploy.py. And I don't think setup.py here is intended to contain all dependencies, only the ones used by the project that are additional to the scrapyd stack, which is why I believed it could be a better option.

@jpmckinney do you have an opinion?

mxdev88 commented 2 years ago

> @mxdev88 Is this branch working? https://github.com/mxdev88/scrapyd-client/tree/build-with-dependencies I couldn't build for local use, but nothing abnormal was reported.

@new-wxw I've made a PR #104. It is working for me. Could you give it a try and let us know?

@jpmckinney could you review #104 and comment?

jpmckinney commented 2 years ago

@mxdev88 #104 looks good. Is there any notable difference between pyassembly and uberegg?

I also wonder whether the requirements ought to be in setup.py/setup.cfg – do any dependencies like pyassembly support that? That said, preferences around how to declare dependencies are wide-ranging (setup.cfg, pyproject.toml, etc.) so in the long term (if there's user demand) we'd probably end up supporting requirements.txt plus these others anyway.

Update: Regarding https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/, requirements.txt is frequently used to pin dependencies, which is what you want when you deploy spiders, so I think it's the appropriate choice for a first implementation.
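To make the pinning point concrete, a requirements.txt used for deployment would typically fix each package with `==`. The helper below is a minimal sketch, not part of scrapyd-client, pyassembly or uberegg; the function name `unpinned_requirements` and the sample package lines are illustrative assumptions, and the parsing is deliberately simplistic (it ignores URL requirements, for instance).

```python
def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version with '=='."""
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt" or "--index-url ...".
        if not line or line.startswith("-"):
            continue
        # Ignore environment markers (after ';') when looking for the pin.
        spec = line.split(";", 1)[0]
        if "==" not in spec:
            unpinned.append(line)
    return unpinned


# Illustrative input only; the packages and versions are arbitrary.
sample = "requests==2.31.0\nparsel>=1.8  # version range\n\n# a comment\nlxml\n"
loose = unpinned_requirements(sample)
print(loose)
```

A check like this could run before building the egg to warn about dependencies whose versions may drift between deployments.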

divtiply commented 2 years ago

> Is there any notable difference between pyassembly and uberegg?

Author of uberegg here. Uberegg was born from frustration with pyassembly. Uberegg uses the pip-recommended way of calling pip install, while pyassembly uses unsupported pip internal calls. Uberegg uses the standard setuptools dist directory to make cleanup easy, while pyassembly uses its own non-standard pyassembly_dist.

mxdev88 commented 2 years ago

> Is there any notable difference between pyassembly and uberegg?

I initially only knew of pyassembly. @divtiply proposed using uberegg instead. I'm pretty indifferent as long as it does the job. Switching would require redoing the tests, but that shouldn't be a big deal. @jpmckinney do let me know if you would prefer going with uberegg instead. I can update the PR or make a new one.

> I also wonder whether the requirements ought to be in setup.py/setup.cfg – do any dependencies like pyassembly support that? That said, preferences around how to declare dependencies are wide-ranging (setup.cfg, pyproject.toml, etc.) so in the long term (if there's user demand) we'd probably end up supporting requirements.txt plus these others anyway.

Neither pyassembly nor uberegg supports creating eggs that include deps from setup.py/setup.cfg, as far as I know.

> Update: Regarding https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/, requirements.txt is frequently used to pin dependencies, which is what you want when you deploy spiders, so I think it's the appropriate choice for a first implementation.

Makes sense to me. Let's stick to a requirements.txt file only for now.

jpmckinney commented 2 years ago

Thanks, @divtiply ! My considerations:

So, @mxdev88, I’d prefer using uberegg, as we might have to anyway to support future pip versions.

mxdev88 commented 2 years ago

@jpmckinney I updated #104 to use uberegg if you want to have another look. If it looks good to you, it's good for me to merge. Otherwise, let me know if anything else is needed. Thanks.

jpmckinney commented 2 years ago

Closed by #104 🎉 🚀