Open macrael opened 7 years ago
Any thoughts on this? Could it be related to building the binary itself, not caching the library? Is there a way I can figure out what is happening for those 70 seconds?
Take a look at bazelbuild/rules_python#1. They just added support for pip dependencies, which are cached, making rebuilds pretty snappy. Perhaps you could try that to speed up builds? I think you'd need some changes to this repo for that to work though. Haven't quite sorted it out myself yet. 😑
oh shoot, I should have responded to this months ago. The issue you're seeing with rules_pex is that pex's own caching is disabled, because it's incompatible with bazel's concurrency. That is, if bazel happens to run two instances of pex at the same time, pex's cache ends up corrupted and builds will become inconsistent.
If you are able to use egg or wheel dependencies instead of having pex resolve requirements from pypi, it should be much faster.
Is there an example of using egg or wheel dependencies?
(Sent as an email reply to Benjamin Staffin's comment of Mon, Sep 18, 2017, quoted above.)
I spent today trying to make this work, but unfortunately it seems that pex doesn't support manylinux1 wheels (https://github.com/pantsbuild/pex/issues/281), which is what most packages seem to distribute on PyPI. As far as I can tell, that means I can't make this work on my own.
For reference, I was trying to get to the point where I could import grpc. See the wheels here: https://pypi.python.org/pypi/grpcio/1.6.0 . The only available wheels are manylinux1 ones.
Before I realized that this might be the problem, I got this far:
WORKSPACE
http_file(
    name = "pypi_grpcio",
    urls = ["https://pypi.python.org/packages/d0/67/cccd0e58d169cc7077425b296056b553acee7a8fe45ad8e52dce2fe66ab7/grpcio-1.6.0-cp35-cp35m-manylinux1_x86_64.whl"],
)
... repeat for setuptools, protobuf, and six
BUILD
pex_binary(
    name = "smoketest",
    interpreter = "/usr/bin/python3",
    main = "smoketest.py",
    pex_use_wheels = True,
    eggs = [
        "@pypi_setuptools//file",
        "@pypi_six//file",
        "@pypi_protobuf//file",
        "@pypi_grpcio//file",
    ],
)
And the error I got when I run it:
Failed to execute PEX file, missing compatible dependencies for:
protobuf
grpcio
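For anyone hitting the same error: a rough Python sketch of why a manylinux1 wheel can be rejected as "missing compatible dependencies". It splits a wheel filename into its PEP 427 tags and checks the platform tag against a set of accepted platforms. This is not pex's actual resolver code; the `SUPPORTED` set is an assumption standing in for a resolver with no manylinux1 support, the function names are made up for illustration, and the six filename is only an example of a pure-Python wheel.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %s" % filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def platform_is_supported(wheel_platform, supported_platforms):
    """True if any of the wheel's platform tags is in the supported set."""
    # Compressed tag sets join alternative tags with "."
    return any(tag in supported_platforms for tag in wheel_platform.split("."))


# ASSUMPTION: a resolver without manylinux1 support accepts only these tags.
SUPPORTED = {"linux_x86_64", "any"}

grpcio = parse_wheel_filename("grpcio-1.6.0-cp35-cp35m-manylinux1_x86_64.whl")
six = parse_wheel_filename("six-1.11.0-py2.py3-none-any.whl")  # illustrative

print(grpcio["platform"], platform_is_supported(grpcio["platform"], SUPPORTED))
print(six["platform"], platform_is_supported(six["platform"], SUPPORTED))
```

Under that assumption the grpcio wheel is filtered out purely on its platform tag, while a pure-Python wheel tagged `any` still matches.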
Does bazel use multiple threads by default? It's pretty hampering to have to rebuild all reqs from source on every build. Makes local development essentially impossible.
I also tried to use the new rules_python, but found it impossible to get them to use python3. Any pointers there would also be appreciated.
Bazel can be so frustrating!
Is it possible to create a different build rule that builds these, then import that so as to create my own cache?
I think the way to do it is probably comparable to what rules_python does: have a rule for each external dependency that builds a pex containing just that dep (and maybe its transitive deps?), and add a way to roll several pex archives together into a final pex_binary that includes all of them. That way you would end up with a working cache. Sometime last year I spent a few hours trying to hack that together, but I couldn't figure out how to combine pexes in a sane way. If you know of a way to do that, please share :-)
As for grpc, the best solution is likely to be grpc/grpc#8079 if they ever manage to implement it.
I have a disgusting hack to pex to make the manylinux1 wheels work. See my comment on https://github.com/pantsbuild/pex/issues/281 . With this, I have a very hacky tool that generates the WORKSPACE rules to depend on a set of requirements.txt dependencies. I'm hoping to release these tools as open source once I get the whole thing actually working. However, it's also possible that the work on https://github.com/bazelbuild/rules_python will "catch up" and actually work for real applications, and this may become unnecessary.
For now (also waiting for rules_python to catch up and actually support dependencies and python3 at the same time), I've written a script that reads a single requirements.txt file, creates a venv, and builds all the dependencies locally. That avoids using any of the cached manylinux1 wheels and produces wheels that work on the Linux host we are targeting.
Then we upload all those built wheels to our Google Cloud Storage and generate a bunch of WORKSPACE entries that reference them. Finally, it generates some BUILD file entries that just create filegroups for each of the top-level dependencies, which can be passed in as wheels to the pex_binary rule. It is ugly but it works.
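A minimal sketch of what the generation step described above could look like, in case it helps someone build their own. It is not the actual script: the function names, the `pypi_` naming scheme (borrowed from the WORKSPACE snippet earlier in this thread), and the `https://example.com/wheels` base URL are all placeholders, and a real version would probably also want to emit a `sha256` for each file.

```python
import re


def repo_name(wheel_filename):
    """Derive an external repository name such as 'pypi_grpcio'."""
    dist = wheel_filename.split("-")[0]
    return "pypi_" + re.sub(r"[^A-Za-z0-9]+", "_", dist).lower()


def http_file_rule(wheel_filename, base_url):
    """Render one WORKSPACE http_file() entry for a mirrored wheel."""
    return 'http_file(\n    name = "%s",\n    urls = ["%s/%s"],\n)\n' % (
        repo_name(wheel_filename),
        base_url.rstrip("/"),
        wheel_filename,
    )


def filegroup_rule(name, wheel_filenames):
    """Render a BUILD filegroup() bundling a dependency and its wheels."""
    srcs = "".join(
        '        "@%s//file",\n' % repo_name(w) for w in wheel_filenames
    )
    return 'filegroup(\n    name = "%s",\n    srcs = [\n%s    ],\n)\n' % (name, srcs)


# Hypothetical inputs: locally built wheels and a placeholder mirror URL.
wheels = [
    "grpcio-1.6.0-cp35-cp35m-linux_x86_64.whl",
    "six-1.11.0-py2.py3-none-any.whl",
]
base = "https://example.com/wheels"

workspace_text = "\n".join(http_file_rule(w, base) for w in wheels)
build_text = filegroup_rule("grpcio", wheels)
print(workspace_text)
print(build_text)
```

Each generated filegroup can then be listed wherever the pex_binary rule accepts wheel files, so the top-level dependency drags its transitive wheels along.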
It could definitely stand to be turned into a proper library but I'm hesitant to do that when it seems like the canonical python rules are slowly getting to be usable.
Amazing, this sounds basically identical to what I am doing, with the minor manylinux1 exception. So far it seems to be working, although the time to create pexes with lots of dependencies is understandably huge, so I may end up needing to make some changes to support PEX_PATH linking, at least for tests.
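For context on the PEX_PATH idea: pex reads a `PEX_PATH` environment variable at runtime, a colon-separated list of other pex files to merge into the environment. A rough sketch of how a test runner could use that, so a small test pex borrows dependencies from pexes built once. The helper name and file paths are hypothetical, and none of this is wired into rules_pex.

```python
import os


def pex_path_env(dep_pexes, base_env=None):
    """Return a copy of the environment with PEX_PATH set to dep_pexes."""
    env = dict(os.environ if base_env is None else base_env)
    # pex expects a colon-separated list of pex files.
    env["PEX_PATH"] = ":".join(dep_pexes)
    return env


# Hypothetical dependency pexes, each built (and cached) separately.
deps = ["deps/grpcio.pex", "deps/protobuf.pex"]
env = pex_path_env(deps, base_env={"PATH": "/usr/bin"})
print(env["PEX_PATH"])

# A thin test pex could then be launched with something like:
#   subprocess.check_call(["./smoketest_test.pex"], env=env)
```

The trade-off is that the dependency pexes have to be available next to the test at runtime, which is probably acceptable for tests but less so for a deployable binary.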
I don't know if the pex rule is re-downloading it every time or what but it makes development pretty impossible. Is this expected? Shouldn't it be cached in some way?
here's what my rule looks like:
If I make a change to update_dns.py and then re-run the script with bazel it takes 70 seconds or so. Reruns without any changes to the file are quick.