Open thejcannon opened 1 year ago
I'm not sure I agree that this is a desirable goal. Like any other codebase, we commonly use Python libraries. These are not tools we invoke via a subprocess, so I don't see the analogy to those.
If the purpose is to avoid a pip resolve at install time, this is what scie is for. In a pants-scie world, we would deploy a binary that contains all the 3rdparty deps already baked into the file.
I think the energy should be going into pants-as-a-scie. That gives us, I think, the wins this issue is striving for, as well as cutting PyPI out of our deploy entirely, in favor of GitHub releases. One download and done!
I don't think the scie solution solves the size or security concerns, and surely doesn't help with the issue of conflicting dependencies (like we recently got hurt by the requests library colliding with in-repo plugins).
Consider the reason we don't add new dependencies today (at the top of `requirements.txt`). Would we have added our current deps if they weren't already there?
Also imagine if we didn't need to build Pants as a PEX for scie to work. The wheel was all you needed...
Sure, it is better not to need external deps in a vacuum. But we use them! The alternatives I see are to replicate functionality in our codebase that we could be using from someone else's, which doesn't seem like a great tradeoff except in trivial cases. Or to remove functionality. Or am I missing a third option here? I suppose more binding to Rust functionality (e.g., to replace requests)? That could make sense, and improve performance.
The third-party plugin issue is indeed thorny. Reducing our own deps will mitigate that, and is a good reason to pursue this (why wasn't it mentioned above, though?). We can't fully solve this, though: if you consume two third-party plugins (say, not provided by us), their requirements can collide! But I agree that we are the overwhelmingly dominant provider of plugins for now, so our own footprint is the main problem.
And is size actually an issue? Some numbers would be instructive.
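One rough way to get such numbers is to sum the on-disk size of each installed dependency. This is only a sketch: `installed_size` is a hypothetical helper, the names in the usage loop are taken from `requirements.txt` as examples, and it counts only files recorded in each distribution's metadata.

```python
from importlib import metadata
from pathlib import Path
from typing import Optional


def installed_size(dist_name: str) -> Optional[int]:
    """Return the total size in bytes of an installed distribution's files.

    Returns None when the distribution is not installed in this environment.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    total = 0
    # dist.files can be None when the installer did not write a RECORD file.
    for entry in dist.files or []:
        path = Path(entry.locate())
        if path.is_file():
            total += path.stat().st_size
    return total


if __name__ == "__main__":
    # Example names only; swap in whatever the real requirements list holds.
    for name in ("PyYAML", "psutil", "setuptools"):
        size = installed_size(name)
        print(name, "not installed" if size is None else f"{size / 1024:.0f} KiB")
```

Run inside the virtualenv that Pants is installed into, this would give a per-dependency breakdown to compare against the size of the Pants wheel itself.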
Yes, another option is to use Rust facilities, which has its own tradeoffs, but those are easier to stomach.
And the last is Pex. Some of our dependencies could be provided by shifting to a PEX process.
> And the last is Pex. Some of our dependencies could be provided by shifting to a PEX process.

For example? These would have to be cases where process invocation overhead is tolerable. I guess network requests might fall under that category.
I don't think 0 dependencies is a reachable goal, but I agree with keeping the number of deps low (as low as sensible), and periodically reviewing which deps we have is probably a good idea.
OK, here's my homework:
Scraped from: `pants paths --from=src/python/pants:pants-packaged --to=3rdparty/python/requirements.txt | grep # | sort | uniq`

| Name | Purpose | Backends | Removal Strategy | Difficulty/Risk |
|---|---|---|---|---|
| ansicolors | Terminal Colorer | (core) | just in-source | Low/Low |
| chevron | templating | Go | `string.Template` or f-strings | Low/Low |
| fasteners | locks | (core) | in-source maybe? | ???/Medium |
| ijson | json parser | Go | `json` stdlib, otherwise Rust-based | (Depends?)/Low |
| importlib-resources | resource-loading | (core) | stdlibrary (yay Py3.9!) | Low/Low |
| node-semver | version comparison | JS | ??? | ??? |
| packaging | version comparison | (core), Python | ??? | ???/High |
| pex | (We literally use it just to get `PEX_PYTHON_PATH` from RC files) | Python | In-Source or use a process | Low/Low |
| psutil | Process utilities | (core) | Move to Rust | ???/High |
| python-lsp-jsonrpc | LSP Support | (BSP) | ??? | ???/??? |
| PyYAML | YAML loading/dumping | Helm, JS, OpenAI, (core) | Move to Rust | Medium/Low |
| setproctitle | Set/Get Process title | (core) | In-Source or Move to Rust | Low/Low |
| setuptools | Resource loading, Requirement Parsing | Java, JS, Python, (core), JVM | Use `importlib` for resources. Reqs use ??? | ???/High |
| toml | TOML support | Python, BSP, JVM, (core) | Move to Rust | Medium/Low |
| types-PyYAML | typing | N/A | Just exclude... | Low/Low |
| types-setuptools | typing | N/A | Just exclude... | Low/Low |
| types-toml | typing | N/A | Just exclude... | Low/Low |
| typing-extensions | Future typing shenanigans | Docker, Helm, JS, (core), Python | Case-by-case, newer Python helps a bunch | Medium/Low |

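To make the "`string.Template` or f-strings" strategy for chevron concrete, here is a minimal sketch. The template text and the `render` helper are made up for illustration and are not taken from the Go backend. Note that `string.Template` only does flat placeholder substitution, so any Mustache sections or loops in the real templates would need to be rewritten by hand.

```python
from string import Template

# Hypothetical template; Mustache-style {{name}} placeholders would have to be
# rewritten as ${name} for string.Template.
GO_MOD_TEMPLATE = Template("module ${module}\n\ngo ${go_version}\n")


def render(template: Template, **values: str) -> str:
    # substitute() raises KeyError on a missing placeholder, which surfaces
    # template/data mismatches early instead of emitting a partial file.
    return template.substitute(**values)


if __name__ == "__main__":
    print(render(GO_MOD_TEMPLATE, module="example.com/demo", go_version="1.21"))
```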
There are some for which I couldn't guess the strategy or difficulty off the bat.
So some things I see:

- `scie-pants` was a leap, since we can use more modern Pythons. Resource loading, typing, and maybe TOML support all get stdlib support (depending on if we upgrade again)
- `packaging` will be a bitch to upgrade, but it's mostly (all?) version parsing. We might consider vendoring that code?

So, that puts us in a place where I think if we figure out the more hairy ones, we really could have 0 3rdparty reqs.
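To give a feel for what an in-source version parser might involve, here is a deliberately tiny sketch that handles only plain dotted release numbers. `parse_release` is a hypothetical name. Real PEP 440 versions also have epochs, pre/post/dev releases, and local segments, which is where most of the `packaging` code (and the risk) lives.

```python
import re
from typing import Tuple

# Only plain dotted releases such as "2.17.0"; anything else is rejected.
_RELEASE_RE = re.compile(r"\d+(\.\d+)*")


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse a dotted release string into a tuple that compares numerically."""
    if not _RELEASE_RE.fullmatch(version):
        raise ValueError(f"unsupported version string: {version!r}")
    parts = tuple(int(p) for p in version.split("."))
    # Drop trailing zeros so that "1.0" and "1.0.0" compare equal.
    while len(parts) > 1 and parts[-1] == 0:
        parts = parts[:-1]
    return parts


if __name__ == "__main__":
    print(parse_release("2.10.0") > parse_release("2.9.1"))
    print(parse_release("1.0") == parse_release("1.0.0"))
```

Tuples compare element by element, so "2.10" sorts after "2.9", which a plain string comparison would get wrong.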
I'd also argue it's probably worth picking the low-hanging fruit so the list is as small as possible :smile: I'm happy to do that myself. Hell, I'm happy to do all of this work.
(FWIW `importlib_resources` can and should be removed: https://github.com/pantsbuild/pants/pull/19339)
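For reference, the stdlib replacement on Python 3.9+ is `importlib.resources.files`. A minimal sketch, with `read_resource` as a hypothetical helper and the stdlib `json` package standing in for a real Pants package:

```python
from importlib import resources


def read_resource(package: str, name: str) -> str:
    # files() returns a Traversable rooted at the package; joinpath() and
    # read_text() then work much like their pathlib counterparts.
    return resources.files(package).joinpath(name).read_text(encoding="utf-8")


if __name__ == "__main__":
    # Stand-in example: read a file that ships inside the stdlib json package.
    text = read_resource("json", "__init__.py")
    print(len(text), "characters read")
```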
Is your feature request related to a problem? Please describe.
In an ideal world, Pants-the-wheel has no 3rdparty reliance. Everything is either provided through static Rust code in the engine, or we use Pex to download and install tools (like we do today).
This would reduce Pants' installation footprint, increase security, and slightly speed up first-time installation of Pants.

Describe the solution you'd like
The pantsbuild wheel requires 0 3rdparty deps.

Describe alternatives you've considered
N/A

Additional context
:taco: