pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0

release pants as a pex #4896

Closed kwlzn closed 6 years ago

kwlzn commented 6 years ago

now that we're adequately entangled in platform-specific deps that are oftentimes resolving (or failing to resolve.. or build..) against pypi, it would probably make sense for us to go ahead and start building and releasing a binary pex of pants alongside (or in place of) sdists. this should eliminate most customer issues related to resolves and/or resolve-time builds of native deps at pants install/bootstrap time and ensure a consistent and tested shared set of backing wheels for running pants in most cases.

at Twitter, we deploy pants only via pex using a custom build and release process. having a formally released OSS pex that we can directly consume will help us stay closer to shared releases vs direct sha consumption.

kwlzn commented 6 years ago

core problems to solve here:

jsirois commented 6 years ago

can we get a travis instance of OSX on a platform old enough ...

I think likely not, but it seems worth it to me to pay up again for a https://macstadium.com instance at the worst. They offer OS choices as old as 10.7 and even VMWare, in which case we could run OSX VMs (not sure of licensing costs for OSX versions) and generate binaries per OSX version if we're super paranoid about compat.

kwlzn commented 6 years ago

sounds great.

jsirois commented 6 years ago

can we get a travis instance of OSX on a platform old enough ...

I think likely not ...

We can, using osx_image: xcode6.4, which nets us OSX 10.10. I'm trying to flip to this in #4914.

kwlzn commented 6 years ago

nice

wisechengyi commented 6 years ago

Could you clarify:

  1. Is the plan to release just pantsbuild.pants as pex, or all the other packages together in a single pex?
  2. Where are the pex going to be stored?
  3. Will this change the existing publishing story to pypi?

kwlzn commented 6 years ago

  1. up for discussion, but I'd assume just the core pants package as pex with the others bootstrapped as plugins via the existing plugin resolver.
  2. github releases let you directly attach files, so probably there (as we do with pex: https://github.com/pantsbuild/pex/releases/tag/v1.2.12).
  3. shouldn't, no.

jsirois commented 6 years ago

Now that #4906 is in - we release pants via wheels, including linux/mac specific ones for the main pantsbuild.pants distribution - the prerequisites for releasing a pex are all met afaict.

stuhood commented 6 years ago

I'm going to sketch out how to do this today, and hopefully get started on it.

stuhood commented 6 years ago

My proposal:

  1. The linux and osx binary builder shards will begin pushing SHA-keyed and SHA-versioned platform specific whls to a "well known" (but private, ie: not guaranteed to be stable) location in our (existing?) S3 bucket.
  2. A script named pex_from_whls.sh will be added that given a SHA (defaulting to HEAD) and a list of platforms (defaulting to both linux and osx), consumes the wheels to build a pex from src/python/pants/bin:pants by setting:
    • --python-setup-platforms=${platforms}
    • --python-repos-repos=${s3_whls_path}

For now, pex_from_whls.sh will only be run manually by users who need pexes. In future PR(s), we could begin running it either as a new travis Build Stage, or as part of release.sh.

The "list of platforms" option for pex_from_whls.sh is an important component for turnaround time, as it will not always be necessary to wait for the osx binary builder shard to complete if a pex does not need to be cross-platform (say, to test it in a company's internal linux-only CI).

I initially thought about building two single-platform pexes (which is significantly easier), but the downside of that approach is that whatever bootstrap script fetches the pex would need to do platform detection, which pex already implements for multi-platform pexes.
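
One reading of the proposal above, sketched in Python rather than shell. This is not the actual pex_from_whls.sh: the two flag names and the target address come from the proposal, while the bucket URL, the platform strings, the list-literal value format, and the use of the binary goal are assumptions made for illustration.

```python
import subprocess

# Assumptions: the bucket URL and the platform strings are hypothetical.
WHEELS_BASE_URL = "https://example.com/pants-wheels"
DEFAULT_PLATFORMS = ("linux-x86_64", "macosx-10.4-x86_64")


def current_sha():
    """Return the SHA of HEAD in the current git checkout."""
    out = subprocess.check_output(["git", "rev-parse", "HEAD"])
    return out.decode("utf-8").strip()


def pex_build_command(sha=None, platforms=DEFAULT_PLATFORMS):
    """Build the argv that would produce a pants pex from SHA-keyed wheels.

    `sha` defaults to HEAD and `platforms` to both linux and osx, as proposed.
    """
    if not platforms:
        raise ValueError("at least one platform is required")
    if sha is None:
        sha = current_sha()
    wheels_path = "{}/{}/".format(WHEELS_BASE_URL, sha)
    # Assumed value format: a Python-style list literal for list options.
    platform_list = "[{}]".format(", ".join(repr(p) for p in platforms))
    return [
        "./pants",
        "--python-setup-platforms={}".format(platform_list),
        "--python-repos-repos=[{!r}]".format(wheels_path),
        "binary",
        "src/python/pants/bin:pants",
    ]
```

Passing a single platform, e.g. `pex_build_command(sha, platforms=("linux-x86_64",))`, corresponds to the linux-only case where there is no need to wait for the osx shard.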


Will start implementation tomorrow... would appreciate any feedback.

kwlzn commented 6 years ago

1 sounds good, but we'll likely have to resort to setup.py mod hacks (as we do today internally) to achieve SHA-versioned whls (but I think it would be worthwhile to have this annotation on the .whl inside the pex). we might be able to improve this with a setuptools plugin, or it may not be worth it.

for #2, a shell script seems fine for now. eventually, I wonder if we couldn't get away with doing this purely in pants. --python-setup-platforms could be decomposed as 3 separate BUILD targets (e.g. src/python/pants/bin:{pants_osx,pants_linux,pants_multi}). and --python-repos-repos could potentially be done as a simple custom target type (pants_binary?) that wrapped PythonBinary, parameterized the current git sha as python_binary(repositories=['http://.../{}/'.format(sha)]) and injected likewise SHA-versioned deps, etc.

it would probably also make sense for the same binary builder shards generating pants' own bdists to also do a pip wheel resolve against pants' transitive deps and stash those in the S3 bucket also (i.e. automating the cheeseshop model).. this would 1) ensure that the same machine builds the native_engine binary as well as any backing 3rdparty deps for alignment and 2) enable painless multi-platform resolves (from any runtime OS) for deps like psutil that don't provide bdists on pypi etc. otherwise, afaict pex_from_whls.sh will never be able to create a singular multi-platform pex since it can only ever run on one platform at a time.
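
A minimal sketch of the dependency-wheel step described above, assuming each binary builder shard runs it on its own platform. The requirements file name and the staging directory are hypothetical, and the upload to S3 is left out.

```python
import subprocess
import sys


def wheel_resolve_command(requirements_file, wheel_dir):
    """argv for `pip wheel`, which builds wheels for the listed requirements
    and their transitive deps into `wheel_dir`."""
    return [
        sys.executable, "-m", "pip", "wheel",
        "--requirement", requirements_file,
        "--wheel-dir", wheel_dir,
    ]


def build_dependency_wheels(requirements_file="requirements.txt",
                            wheel_dir="dist/deps"):
    """Run the resolve on the current platform.

    Both default paths are assumptions, not the repo's real layout.
    """
    subprocess.check_call(wheel_resolve_command(requirements_file, wheel_dir))
```

Because source-only deps such as psutil get compiled on the shard that runs this, the resulting wheels are platform specific, which is what would let a later multi-platform resolve succeed from any one OS.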

stuhood commented 6 years ago

to also do a pip wheel resolve against pants' transitive deps and stash those in the S3 bucket also (i.e. automating the cheeseshop model)

Yea, this will be necessary. There are source-only deps on PyPI which you wouldn't be able to translate otherwise, afaik.

illicitonion commented 6 years ago

It's worth noting that a release build of the native engine is large. The Linux .so is 65M, and the Mac .so is 15M. They zip down ok, so that a pex is only about 35M, but 35M is still very large. (I hadn't realised until now how different they were; I'm curious, but not quite curious enough to find out...)

It's likely that when https://github.com/rust-lang/rust/issues/36342 is resolved and we can be building a cdylib not a hacky dylib, the linker can do a lot more stripping and these will get smaller, but platform-specific pexes would have a non-trivial size improvement (which maps to a non-trivial speed improvement on first run).

stuhood commented 6 years ago

A few notes after additional reading:

So, adjustments to the plan:

  1. Reuse the travis whl building path, but adjust it to:
    • namespace the whls under the current SHA, to avoid collisions [0]
    • push SHA-suffixed releases for branches
    • push un-suffixed releases for tags
  2. To avoid the release race condition above, I'll adjust the release workflow to move the tag push much earlier in the process. Because it will be a tag, CI will build un-suffixed whls for the version, which the release script can consume.

[0] Both a suffixed and un-suffixed version will be built for a particular SHA if it both exists in a branch and is tagged, or is tagged multiple times, but these collisions should be harmless because the packages are identical.
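
The branch/tag rule in the adjusted plan can be sketched as below. The `+<8-char sha>` suffix format is an assumption modeled on the `1.9.0.dev0+aad4b7be` string that appears at the end of this thread, and both function names are hypothetical.

```python
def build_version(base_version, sha, is_tag):
    """Tags get the plain version; branch builds get a SHA suffix."""
    if is_tag:
        return base_version
    # Assumed suffix format: '+' followed by the first 8 characters of the SHA.
    return "{}+{}".format(base_version, sha[:8])


def wheel_namespace(sha):
    """Key prefix that namespaces a build's whls under its full SHA."""
    return "{}/".format(sha)
```

For a SHA that is both on a branch and tagged, both results land under the same `wheel_namespace(sha)`, which is the harmless collision described in note [0].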

stuhood commented 6 years ago

As mumbled to myself on #5118, I've been exploring @illicitonion's suggestion to always build suffixed releases in travis, and to "re-version" them only when we want to release them. While a bit crazy, it was perhaps not too crazy: I've implemented it in #5145.

So, the plan of record is now to:

  1. Always build suffixed pants wheels
  2. Make no changes to how releases are tagged
  3. Re-version the suffixed wheels to stable wheels after fetching them from S3 and before testing them and uploading them to pypi.
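
A rough sketch of the naming side of step 3, assuming the suffix is a `+<sha>` local version segment (as in the `1.9.0.dev0+aad4b7be` string later in this thread). This is not the code from #5145. A real re-version also has to rewrite the version inside the wheel (the dist-info directory name, METADATA, and the RECORD hashes); only the filename mapping is shown.

```python
def stable_version(suffixed_version):
    """Drop the '+<sha>' segment: '1.9.0.dev0+aad4b7be' -> '1.9.0.dev0'."""
    return suffixed_version.split("+", 1)[0]


def reversioned_wheel_name(filename):
    """Map a suffixed wheel filename to its stable filename.

    Wheel filenames are '-'-separated, with the version as the second field.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: {}".format(filename))
    fields = filename[: -len(".whl")].split("-")
    fields[1] = stable_version(fields[1])
    return "-".join(fields) + ".whl"
```

The example wheel tags used to exercise this (cp27, linux_x86_64) are illustrative only.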

stuhood commented 6 years ago

Alright... the first 80% of this has landed via #5145 and #5118. I'll run the release tomorrow in order to figure out how I broke everything.

The last 20% (possibly also accomplishable tomorrow?) will be to take all of the frozen input wheels and build pexes. There is an open question of where to publish them (github? s3?), and where to link to them from (the docsite? pypi?).

stuhood commented 6 years ago

Got distracted by various things and didn't get to actually building pexes. But got the release out, and a few fixes to the release script are here: #5152.

stuhood commented 6 years ago

Next step is out in #5159

stuhood commented 6 years ago

Pretty deep in a rabbit hole on this one, but getting close! Now updating g++ in the docker image to get gRPC compiling.

Note to self: http://linux.web.cern.ch/linux/devtoolset/#install

stuhood commented 6 years ago

Landed #5167 .. next up, tagging that to docker hub, and swapping over the travis_ci image to start FROM it. cc @illicitonion, @jsirois

stuhood commented 6 years ago

Validated that on SHAs after 96264d36a82aa2da7b2cc533d71c615e50c4fefe, running:

build-support/bin/release.sh -p

produces a cross-platform pex in dist that works on centos6 and on OSX. Huzzah.

stuhood commented 6 years ago

Alright: time to finalize this ticket I think (and open another one for followup).

I'd like to propose that for now we defer publishing pexes as part of the release process (and move that into a separate ticket), and instead make it very easy for consumers to manually request a pex for the prebuilt wheels. The reality of our (recent) usage is that we need some number of cherry-picks atop any internal release, and stable releases happen infrequently enough that we can't rely on release binaries.

So I'd propose to close this ticket by:

upstream (here):

  1. adding documentation advertising the ability for anyone to request contributor access
  2. giving contributors the ability to push branches/tags to origin to trigger whl builds (and thus treat travis as a community resource)
  3. document the process of building a pex by manually invoking release.sh -p (as mentioned above: manual for now. will open a followup for publishing pexes as part of the release process)

internally:

  1. Refer team members to the upstream docs for kicking off builds as contributors
  2. build a pex for the relevant sha and execute any internal release steps


Thoughts? The "contributors can push branches" bit in particular seems potentially controversial.

jsirois commented 6 years ago

I think the odd thing here is an OSS project where committers can build special public binaries, whereas plain old users cannot. It's true committers are by definition privileged in general in any OSS project, but this particular privilege seems a bit confusing. We then have public binaries for anyone to use that are a not easily identifiable mix of commits and components. All this said, Twitter contributes a huge amount to the project, so - constructive ideas:

jsirois commented 6 years ago

And, just got the contributors vs committers distinction, but the confusion objection still stands - we allow anyone, via contributor access request, to publish, but we still have a mass of unidentifiable public releases as a result.

kwlzn commented 6 years ago

So I'd propose to close this ticket by:

upstream (here):

  1. adding documentation advertising the ability for anyone to request contributor access
  2. giving contributors the ability to push branches/tags to origin to trigger whl builds (and thus treat travis as a community resource)
  3. document the process of building a pex by manually invoking release.sh -p (as mentioned above: manual for now. will open a followup for publishing pexes as part of the release process)

internally:

  1. Refer team members to the upstream docs for kicking off builds as contributors
  2. build a pex for the relevant sha and execute any internal release steps

this sgtm - and at the very least seems like a perfectly reasonable experiment that we can always tweak later as needed.

..we still have a mass of unidentifiable public releases as a result.

could we solve for this via better annotation of the custom vs mainline releases in either tag naming or posted descriptions on the github release landing page? or maybe we could use this flag for all custom releases:

[image: screenshot of the flag]

Why not a twitter fork ...

we've long had a custom solution to this - the goal here is to unfork as much as possible, so that all consumers can benefit from and share the same shared release mechanics, build platforms, etc.

jsirois commented 6 years ago

My proposal would share all this (shared release mechanics, build platforms, etc), just not the public branches/tags, travis time and storage.

The other downside to enabling this is we'd then be effectively promoting fragmentation. The easier it is to create a custom binary using the main project, the easier it is to stay mis-aligned - ie cherry pick vs just fix the mainline or work off of weekly releases.

I'm not hugely opposed, this just seems decidedly odd and despite all pretensions, a feature being enabled for Twitter only.

stuhood commented 6 years ago

Got great feedback here, and in the slack #releases room. Will mail about a completely different proposal when I get some more time tomorrow.

stuhood commented 6 years ago

I've moved the ball a lot on this one, and we now have everything in place to begin publishing a cross-platform pex as part of releases. But I need to context switch away to continue to clean up our internal release process, so unassigning this.

jsirois commented 6 years ago

The 1st release worked: https://github.com/pantsbuild/pants/releases/tag/release_1.9.0.dev0

But there are warts. Arguably, the file name should just be pants.pex since the github release contains the version info. Also the pex reports an unexpected version for the end user:

$ curl -LO https://github.com/pantsbuild/pants/releases/download/release_1.9.0.dev0/pants.1.9.0.dev0.pex && chmod +x pants.1.9.0.dev0.pex && ./pants.1.9.0.dev0.pex -V
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   609    0   609    0     0    904      0 --:--:-- --:--:-- --:--:--   904
100 28.4M  100 28.4M    0     0  4050k      0  0:00:07  0:00:07 --:--:-- 6529k
1.9.0.dev0+aad4b7be

Filed #6026 to address these items.
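
The two warts can be made concrete with a small sketch. The URL pattern is generalized from the single download URL above, so treat it as an assumption rather than a documented scheme; both helper names are hypothetical.

```python
def release_pex_url(version):
    """Download URL for a released pex, following the 1.9.0.dev0 example."""
    return (
        "https://github.com/pantsbuild/pants/releases/download/"
        "release_{v}/pants.{v}.pex".format(v=version)
    )


def matches_release(reported, released):
    """True if the version printed by `-V` equals the release version once
    any '+<sha>' suffix is dropped."""
    return reported.split("+", 1)[0] == released
```

A plain equality check between the reported version and the release version fails for this pex, which is the surprise for the end user; `matches_release` passes only because it ignores the suffix.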