Okay, finally through the list! Now I'm going to rebase against master and drive out any changes that have been merged in the last 6 weeks or so. Then another full build run of these in a stage1 to make sure we're all really good.
Rebase is good, repushed and going to update Habitat versions and rebuild the set in a stage1 Studio.
Rebuilding the base packages above in a stage1 Studio looked good, except for these Plans which download from SourceForge (currently experiencing a large service outage):
In the process I found a missing dependency of `core/bats` on `core/hab-plan-build`, which is now in `build-base-plans.sh`.
The timings for the set were broken up between the Plans that stalled (again, SourceForge-related), which I resolved by downloading the source on another system, putting it in the Studio's source cache, and continuing the program. Here are the timings for each segment:
| Segment | real | user | sys |
| --- | --- | --- | --- |
| Start -> pkg-config | 269m10.666s | 524m8.489s | 45m11.665s |
| ncurses -> shadow | 2m0.602s | 1m35.235s | 0m20.543s |
| psmisc -> psmisc | 0m5.838s | 0m4.683s | 0m1.003s |
| procps-ng -> gdbm | 16m31.664s | 6m53.641s | 2m16.246s |
| expat -> findutils | 19m55.995s | 17m50.555s | 3m55.133s |
| xz -> util-linux | 10m6.373s | 9m45.170s | 5m58.384s |
| tcl -> tcl | 1m50.839s | 1m43.922s | 0m10.733s |
| expect -> wget | 10m42.832s | 3m47.270s | 0m36.734s |
| unzip -> libarchive-musl | 8m45.865s | 9m1.078s | 1m23.558s |
| rust -> hab | 6m7.727s | 29m46.053s | 0m30.719s |
| bats -> libbsd | 1m36.106s | 1m26.842s | 0m9.145s |
| clens -> hab-studio | 0m11.036s | 0m8.200s | 0m1.057s |

(Note: why did libarchive-musl stop that loop? No idea.)
This might look long, but keep in mind that the `DO_CHECK` environment variable was set, meaning that all the `do_check()` build phases were triggered and were successful.
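For anyone following along, running with checks enabled looks roughly like this (a sketch; I'm assuming the variable is exported in the Studio environment before kicking off the build script):

```sh
# Enable the do_check() phases for every plan in the run, then kick off
# the base set build (script path as referenced elsewhere in this issue).
export DO_CHECK=true
./bin/build-base-plans.sh
```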
I'm going to do one more test run against a current master checkout of habitat-sh/habitat, as the build program was not 100% up to date in the last attempt and was checked out at Habitat circa ~0.52.0.
Still left to do is updating the Plans that are failing the PR linting to ensure they are brought up to date with our standards, so expect at least a few more rebases/pushes.
The rebuild in a stage1 Studio last week/weekend went well and ran almost all the way through without stopping. To work around the SourceForge outage issues, I dropped the source tarballs for the packages above into the src cache before running the build (thus skipping those downloads but still verifying the sources).
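If you hit the same outage, the workaround looks roughly like this (a sketch; the tarball name and Studio root path are illustrative, not the exact ones I used):

```sh
# From the host, drop a pre-downloaded source tarball into the Studio's
# source cache so the download is skipped while do_verify() still checks
# the shasum during the build.
cp ~/downloads/expect5.45.tar.gz /hab/studios/<your-studio>/hab/cache/src/
```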
Now for passing the linting on some plans that haven't been updated since I wrote some of the originals.
Now all the Plans are passing linting and appear (hopefully) much more consistent with the other base plans. The branch has been rebased within itself so that the commit order is a combination of the plan build order and which changes were made first (e.g. version bumps, linting updates, etc.). Note that some Plans have 2 or even 3 version bumps, which I attempted to keep for historical and git-bisecting reasons, as well as to preserve author history.
I'm going to rebase this branch again against current master to make sure it still is merge-clean.
Rebase went okay. Will try (most likely) one last stage1 build to ensure that nothing regressed after all the linting fixes. I did manage to break bzip2 for a few hours from a lint fix, so you never know…
Now I'm looking at a final freshening of several foundational packages, namely glibc, binutils, and gcc. The first thing I've found is that glibc 2.27 now requires bison to build, which means building a new stage1 tarball using the Linux From Scratch project as before (the current tarball does not include bison and therefore has insufficient dependencies). Sometimes one small version bump involves a lot of work, and that's really hard to predict in advance.
Here are the remaining Plan updates that I skipped last pass, as I suspected they would be larger and more focused efforts. The above comment alluded to updating the stage1 tarball, which is directly related to the Glibc/Binutils/GCC gang.
Looking good with these updates, now will rebase these updates into the branch and finally rebase against current master.
Rebasing complete, now running a last stage1 regression build…
I found another failing test in `procps-ng` and a glibc-2.27 fix required in `make`. The branch is rebased and updated as a result of that last stage1 run.
As I'm using an updated stage1 tarball that only I have had to date, I published it (i.e. uploaded it to our S3 bucket) and updated the stage1 logic in the Studio codebase so that others will be able to replicate this work, in habitat-sh/habitat#4766 (which should ship in the next Habitat release, most likely today).
Well, it's been a week. The stage1 builds have gone great, but I was unable to use the stage1 artifacts to enter a new default-type Studio without an intermediate Depot/Builder API.
Instead, I managed to refactor and update the install logic used by `hab pkg install` so that I could use a local `core/hab-backline` package that wasn't yet uploaded (instructions to follow), which was the basis of habitat-sh/habitat#4771.
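In practice that means you can point `hab pkg install` at a locally built artifact, something like this (the artifact filename below is made up for illustration):

```sh
# Install a locally built core/hab-backline .hart straight from the
# results directory, with no Depot/Builder API in the middle.
hab pkg install ./results/core-hab-backline-0.55.0-20180321222338-x86_64-linux.hart
```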
Once I was able to start a "stage2" build, the `binutils` package failed. After some digging in there, it turns out we needed to trim the `LDFLAGS` environment variable a bit, just like the tweaks that were needed for the `C*FLAGS` variables (rebased branch inbound soon).
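To give a feel for the kind of tweak involved (a rough sketch, not the actual diff in the binutils Plan):

```sh
# In the Plan's do_prepare(), drop the inherited linker flags so the
# stage1 toolchain's LDFLAGS don't leak into this build, mirroring the
# earlier C*FLAGS adjustments.
do_prepare() {
  LDFLAGS=""
  export LDFLAGS
  build_line "Overriding LDFLAGS=${LDFLAGS}"
}
```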
Running the full base set in a second-stage Studio with the full test suite should be more than enough to shake out any issues or differences related to stage1 vs. default building (these are mostly related to `pkg_build_deps()` differences).
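For context, the difference boils down to entries like these in a plan.sh (package names here are only examples, not any particular Plan's real deps):

```sh
# In a default Studio these build-time deps must resolve to installable
# packages; in a stage1 Studio the equivalent toolchain is supplied by the
# bootstrap tarball instead.
pkg_build_deps=(core/gcc core/make core/coreutils)
```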
Also related to this work is an update to the Bootstrapping Habitat docs page which will form the basis of the testing steps another person could take to verify this work. As I was revisiting this workflow, I found that more and more environment variables were needed inside and outside the Studios to correctly prepare the building set, so I set about folding some of these steps into short scripts that the bootstrapping instructions can run. I'll be including those in this PR as well for completeness.
The first run through a stage2 build was finally successful. I needed further updates and fixes to binutils, procps-ng, and bc and these are now in the current branch which is rebased against current master.
Next up is a full test run of a stage2 build to ensure that nothing else is missed. As this turns a couple-hour task into a six-hour task, I thought it prudent to go wide and shallow first before going deep and wide.
I'm also at a place where I should be able to test a build in a current default Studio, using our existing software to build against. I'm still not sure what to expect here, but I'd like to know if this is possible. If it is, Builder can help us with building some of the base plans. If not, then we're back to a local Studio build of the set, which will form the basis of a new package set. Anyway, updates to follow.
Okay, the stage2 build with tests is solid. Hopefully one last rebase of this branch and it's review-ready.
The bummer news is that when I tried a build using the default Studio, it quickly failed to build `glibc`. I suspect this is because we have pretty old software trying to build the most modern equivalents, and there are edge cases that would need to be considered.
As a result, this means we'll most likely need to build these base plans in a Studio out-of-band from Builder, upload them and build from there.
Hey guess what? I think we're finally ready to move this work forward!
Okay, so what if you wanted to try and build this base set and verify that it works? Or to put things another way: how did I build and test this set?
Related to this work is an update to the Internals/Bootstrapping Habitat page on our docs site. The PR with this update is at habitat-sh/habitat#4829, and you can take an early look on our acceptance website (please excuse the invalid SSL certificate warning). I'm not 100% happy with the rendering of some of the code snippets, so if you're copy/pasting the steps to follow, it might be safer to do so from the git source rather than the docs page itself.
I'll add exactly which steps to follow and which to change in this issue in a minute…we're merging some work in habitat-sh/habitat that's related and may affect what you clone and check out.
The setup, steps, explanations, etc. are going to shortly be at https://www.habitat.sh/docs/internals/#bootstrap-internals, but until then, we're going to use the version in the PR branch for habitat-sh/habitat#4829 which is:
You'll most likely want to get a cloud instance with good compute and reasonable storage; otherwise it takes a lot longer to build.
Start with the "Part III: Preparing to build" section. However, since we're testing this branch and we need a fix from the habitat-sh/habitat repo, you can run this instead:
```sh
$ mkdir habitat-sh
$ cd habitat-sh
$ git clone https://github.com/habitat-sh/habitat.git
$ (cd habitat && git checkout fnichol/studio-new-cleanup)
$ git clone https://github.com/habitat-sh/core-plans.git
$ (cd core-plans && git checkout fnichol/teh-futur)
```
If you're a core maintainer with a legit `core` origin secret key, then you can install it on your host (you can use `hab origin key import` and paste the secret key in--make sure you're the non-root user). Otherwise, you can generate a throwaway `core` key as the docs page suggests--if you aren't going to upload these packages to a real Builder API, it shouldn't matter either way.
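One way to do the import, if you'd rather not paste interactively (the key filename is illustrative; run this as the non-root user):

```sh
# Pipe the core origin secret key file into the import command.
hab origin key import < core-20180101000000.sig.key
```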
The "Part VI: Remaining packages in world" section is still a work in progress; this is where you'd build all other non-base packages if you want to see that--just know that we're talking 14+ hours to do this in serial. Having said that, we'd likely use some form of this to see which Plans need an update or fix. In fact, I'm going to resume this myself and try to submit standalone PRs in core-plans to fix these as they come up (if one of these PRs works for both current packages and new packages, we can merge it earlier with super low risk).

It's possible that you might run low on (or out of) disk, so it's handy to know that these instructions keep all the Studios' root filesystems around. If you need to reclaim space, then head to the "Part VIII: Cleaning up" section and kill some Studios!
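If you just want a quick way to get space back, something like this works too (the Studio root path is illustrative; the docs section covers the full cleanup):

```sh
# Delete a Studio root filesystem you're finished with to reclaim disk.
hab studio -r /hab/studios/<finished-studio> rm
```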
Got an error when running `./core-plans/bin/bootstrap/stage1-build-base-plans.sh`. Here is the error:
```
cp/cp-lang.o c-family/stub-objc.o cp/call.o cp/decl.o cp/expr.o cp/pt.o cp/typeck2.o cp/class.o cp/decl2.o cp/error.o cp/lex.o cp/parser.o cp/ptree.o cp/rtti.o cp/typeck.o cp/cvt.o cp/except.o cp/friend.o cp/init.o cp/method.o cp/search.o cp/semantics.o cp/tree.o cp/repo.o cp/dump.o cp/optimize.o cp/mangle.o cp/cp-objcp-common.o cp/name-lookup.o cp/cxx-pretty-print.o cp/cp-cilkplus.o cp/cp-gimplify.o cp/cp-array-notation.o cp/lambda.o cp/vtable-class-hierarchy.o cp/constexpr.o cp/cp-ubsan.o cp/constraint.o cp/logic.o attribs.o incpath.o c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o c-family/c-cilkplus.o c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o c-family/c-attribs.o c-family/c-warn.o i386-c.o glibc-c.o cc1plus-checksum.o libbackend.a main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -L/hab/pkgs/core/gmp/6.1.2/20180402211659/lib -L/hab/pkgs/core/mpfr/4.0.1/20180402211719/lib -L/hab/pkgs/core/libmpc/1.1.0/20180402211736/lib -lmpc -lmpfr -lgmp -rdynamic -ldl -lz
collect2: error: ld returned 1 exit status
make[3]: *** [../../gcc-7.3.0/gcc/c/Make-lang.in:85: cc1] Error 1
make[3]: *** Waiting for unfinished jobs....
collect2: error: ld returned 1 exit status
make[3]: *** [../../gcc-7.3.0/gcc/lto/Make-lang.in:81: lto1] Error 1
rm gfortran.pod gcc.pod
make[3]: Leaving directory '/hab/cache/src/gcc-build/gcc'
make[2]: *** [Makefile:4706: all-stageprofile-gcc] Error 2
make[2]: Leaving directory '/hab/cache/src/gcc-build'
make[1]: *** [Makefile:23870: stageprofile-bubble] Error 2
make[1]: Leaving directory '/hab/cache/src/gcc-build'
make: *** [Makefile:24007: profiledbootstrap] Error 2
gcc: Build time: 23m58s
gcc: Exiting on error
build-plans.sh run time: 44m26s
Exiting on error
```
Okay, after some sleuthing and trying to reproduce this, I found that…I ran out of disk space on my compute instance. I think this'll be a good lesson: you want to have a reasonable amount of disk to perform this work--at the moment we're both trying with ~120GB root disk and will report back.
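A quick sanity check before kicking off a long run doesn't hurt (nothing Habitat-specific here):

```sh
# Make sure the root filesystem (where /hab and the Studios live) has
# plenty of headroom before starting; ~120 GB has been comfortable.
df -h /
```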
I've successfully built stage1 and stage2 without problems. I'm currently building stage3 (world); it started fine, and I'm confident it'll get to the end ;)
The documentation steps were really clear; I didn't dig into the scripts, just executed them, and it was a nice experience doing this. Only one small glitch found in the doc; I added a comment on the PR.
LGTM ;) :tada:
Edit: Actually, gdb failed with a compilation error:
```
location.c:527:19: error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
   || *argp == '\0'
```
Looks like @fnichol's studio work was merged into the master branch of Habitat, so here are the revised instructions for getting into the stage1 Studio:

1) Spin up a c4.4xlarge instance on AWS with a 120 GB disk.
2) Start tmux and create a new session:
```sh
$ tmux new -s base_plans
```
3) Install Habitat:
```sh
$ curl https://raw.githubusercontent.com/habitat-sh/habitat/master/components/hab/install.sh | sudo bash
```
4) Import the current secret core key:
```sh
$ hab origin key import
```
5) Clone the repositories, check out the branch, and enter the stage1 Studio:
```sh
$ mkdir habitat-sh
$ cd habitat-sh
$ git clone https://github.com/habitat-sh/habitat.git
$ git clone https://github.com/habitat-sh/core-plans.git
$ (cd core-plans && git checkout fnichol/teh-futur)
$ ./core-plans/bin/bootstrap/stage1-studio.sh enter
```
Just rebased the branch against current master.
Now that #1210 is up, I'm going to close this out.
PARTY TIME!
The so-called "base plans" that I'm referring to is in the
bin/build-base-plans.sh
script. Any relevant issues and pull requests are getting placed into the Base Packages Refresh 2018.03 milestone.