golang / go

x/build/cmd/coordinator: write_go_bootstrap_tar failing with 404 on some arm builders #69038

Closed millerresearch closed 1 month ago

millerresearch commented 2 months ago

Go version

gotip

Output of go env in your module/workspace:

n/a

What did you do?

Observed https://farmer.golang.org

What did you see happen?

[plan9-arm](https://github.com/golang/go/wiki/DashboardBuilders) rev [10ed134a](https://go-review.googlesource.com/#/q/10ed134afe1319403a9a6a8b6bb798f29e5a4d5e); [running](https://farmer.golang.org/temporarylogs?name=plan9-arm&rev=10ed134afe1319403a9a6a8b6bb798f29e5a4d5e&st=0xc016f27c00); http://pi4g reverse peer pi4g/88.97.27.83:60662 for host type host-plan9-arm-0intro, 12h30m43s ago
...
  2024-08-22T21:03:27Z finish_get_source after 0s; go@10ed134afe1319403a9a6a8b6bb798f29e5a4d5e
  2024-08-22T21:03:27Z write_go_src_tar 
  2024-08-22T21:03:27Z finish_write_go_bootstrap_tar after 676.1ms; err=500 Internal Server Error; body: writetgz: fetching provided URL "https://go.dev/dl/go1.22.6.plan9-arm-7.tar.gz": 404 Not Found

 +45017.1s (now)
[plan9-arm](https://github.com/golang/go/wiki/DashboardBuilders) rev [d2879efd](https://go-review.googlesource.com/#/q/d2879efd0227df32d6aeee1be58c325b477f22d4); [running](https://farmer.golang.org/temporarylogs?name=plan9-arm&rev=d2879efd0227df32d6aeee1be58c325b477f22d4&st=0xc0396f9500); http://pi4n reverse peer pi4n/88.97.27.83:51435 for host type host-plan9-arm-0intro, 12h29m58s ago
...
  2024-08-22T21:04:15Z finish_get_source after 0s; go@d2879efd0227df32d6aeee1be58c325b477f22d4
  2024-08-22T21:04:15Z write_go_src_tar 
  2024-08-22T21:04:15Z finish_write_go_bootstrap_tar after 615.5ms; err=500 Internal Server Error; body: writetgz: fetching provided URL "https://go.dev/dl/go1.22.6.plan9-arm-7.tar.gz": 404 Not Found

 +44968.7s (now)
[linux-mips-rtrk](https://github.com/golang/go/wiki/DashboardBuilders) rev [ea08952a](https://go-review.googlesource.com/#/q/ea08952aa2db17ce4c14d9f9cb0fab03380073a0); [running](https://farmer.golang.org/temporarylogs?name=linux-mips-rtrk&rev=ea08952aa2db17ce4c14d9f9cb0fab03380073a0&st=0xc02cc1d6c0); http://host-linux-mips64-rtrk reverse peer host-linux-mips64-rtrk/82.117.214.122:43586 for host type host-linux-mips64-rtrk, 8h39m58s ago
...
  2024-08-23T09:33:38Z run_test:go_test:cmd/link/internal/benchmark host-linux-mips64-rtrk
  2024-08-23T09:33:42Z finish_run_test:go_test:cmd/link/internal/benchmark after 3.53s; host-linux-mips64-rtrk
  2024-08-23T09:33:42Z run_test:go_test:cmd/link/internal/ld host-linux-mips64-rtrk
   +2.0s (now)
[linux-mipsle-rtrk](https://github.com/golang/go/wiki/DashboardBuilders) rev [ea08952a](https://go-review.googlesource.com/#/q/ea08952aa2db17ce4c14d9f9cb0fab03380073a0) (sub-repo net rev [4542a426](https://go-review.googlesource.com/#/q/4542a42604cd159f1adb93c58368079ae37b3bf6)); [running](https://farmer.golang.org/temporarylogs?name=linux-mipsle-rtrk&rev=ea08952aa2db17ce4c14d9f9cb0fab03380073a0&st=0xc040599180&subName=net&subRev=4542a42604cd159f1adb93c58368079ae37b3bf6); http://host-linux-mips64le-rtrk reverse peer host-linux-mips64le-rtrk/82.117.214.122:40052 for host type host-linux-mips64le-rtrk, 8h39m53s ago
...
  2024-08-23T09:28:44Z listing_subrepo_modules net
  2024-08-23T09:28:45Z finish_listing_subrepo_modules after 384.7ms; net
  2024-08-23T09:28:45Z running_subrepo_tests net
 +299.4s (now)
[netbsd-arm-bsiegert](https://github.com/golang/go/wiki/DashboardBuilders) rev [b2f3a427](https://go-review.googlesource.com/#/q/b2f3a427dd554874eab570d03297468d22f903b6); [running](https://farmer.golang.org/temporarylogs?name=netbsd-arm-bsiegert&rev=b2f3a427dd554874eab570d03297468d22f903b6&st=0xc042caefc0); http://ebi.bentsukun.ch reverse peer ebi.bentsukun.ch/81.221.220.50:54867 for host type host-netbsd-arm-bsiegert, 8m58s ago
...
  2024-08-23T09:32:50Z finish_get_source after 0s; go@b2f3a427dd554874eab570d03297468d22f903b6
  2024-08-23T09:32:50Z write_go_src_tar 
  2024-08-23T09:32:50Z finish_write_go_bootstrap_tar after 627.7ms; err=500 Internal Server Error; body: writetgz: fetching provided URL "https://go.dev/dl/go1.22.6.netbsd-arm-7.tar.gz": 404 Not Found

  +53.8s (now)
[openbsd-arm-jsing](https://github.com/golang/go/wiki/DashboardBuilders) rev [b2f3a427](https://go-review.googlesource.com/#/q/b2f3a427dd554874eab570d03297468d22f903b6) (sub-repo net rev [4542a426](https://go-review.googlesource.com/#/q/4542a42604cd159f1adb93c58368079ae37b3bf6)); [running](https://farmer.golang.org/temporarylogs?name=openbsd-arm-jsing&rev=b2f3a427dd554874eab570d03297468d22f903b6&st=0xc056ccb500&subName=net&subRev=4542a42604cd159f1adb93c58368079ae37b3bf6); http://gobuilder-arm.sing.id.au reverse peer gobuilder-arm.sing.id.au/206.83.113.114:32633 for host type host-openbsd-arm-joelsing, 8m13s ago
...
  2024-08-23T09:32:55Z finish_get_source after 0s; go@b2f3a427dd554874eab570d03297468d22f903b6
  2024-08-23T09:32:55Z write_go_src_tar 
  2024-08-23T09:32:56Z finish_write_go_bootstrap_tar after 1.24s; err=500 Internal Server Error; body: writetgz: fetching provided URL "https://go.dev/dl/go1.22.6.openbsd-arm-7.tar.gz": 404 Not Found

  +48.1s (now)

What did you expect to see?

It appears the build script is trying to fetch bootstrap archives of the form go1.22.6.GOOS-arm-7.tar.gz when only go1.22.6.GOOS-arm.tar.gz exists (i.e. without the -7).

cherrymui commented 2 months ago

cc @golang/release @dmitshur

gopherbot commented 2 months ago

Change https://go.dev/cl/608076 mentions this issue: dashboard: leave out -7 suffix from go.dev/dl/ bootstrap URLs

dmitshur commented 2 months ago

Thanks for reporting. This is a mistake in CL 520901, which didn't take into account the "-5" or "-7" suffixes in host config's HostArch. For GOOS != linux, the GOARCH = arm go.dev/dl/ archives are built with the cross-compilation default of GOARM=7, which is what the builders mentioned above are looking to download. Sent CL 608076.
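
To illustrate the mismatch described above, here is a minimal sketch of building a go.dev/dl/ bootstrap URL from a host arch string that may carry a GOARM suffix. It is not the code from CL 608076: the function name `bootstrapURL` and its parameters are hypothetical, and it only handles the "-7" suffix named in the CL title. How "-5" hosts should be handled is not covered in this thread, so the sketch leaves them alone.

```go
package main

import (
	"fmt"
	"strings"
)

// bootstrapURL is a hypothetical helper, not the coordinator's real code.
// hostArch is a GOOS-GOARCH string that may end in a GOARM suffix such as "-7".
// The go.dev/dl/ archive names seen in this issue have no such suffix,
// so the sketch drops it before building the URL.
func bootstrapURL(version, hostArch string) string {
	hostArch = strings.TrimSuffix(hostArch, "-7")
	return "https://go.dev/dl/" + version + "." + hostArch + ".tar.gz"
}

func main() {
	// prints https://go.dev/dl/go1.22.6.plan9-arm.tar.gz
	fmt.Println(bootstrapURL("go1.22.6", "plan9-arm-7"))
	// A host arch without a suffix passes through unchanged.
	fmt.Println(bootstrapURL("go1.22.6", "netbsd-arm"))
}
```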

Please note that the legacy build infrastructure isn't intended to be fully supported beyond the May 17, 2024 date (golang-dev thread), so we can only fix minor issues similar to this in order to help finish ongoing builder migrations to LUCI and give them more time. Thanks for your work on migrating the remaining builders to LUCI.

millerresearch commented 2 months ago

> Please note that the legacy build infrastructure isn't intended to be fully supported beyond the May 17, 2024 date (golang-dev thread), so we can only fix minor issues similar to this in order to help finish ongoing builder migrations to LUCI and give them more time. Thanks for your work on migrating the remaining builders to LUCI.

The plan9-arm LUCI builder is pretty stable now, with no repeatable failures and no more intermittent flakes than the legacy version. Do I need to do something formal to switch off the legacy plan9-arm builder or just stop running it?

millerresearch commented 2 months ago

Looks like the fix wasn't sufficient. All plan9-arm builds are now failing like this:

Build log:
plan9-arm at 1fd8557249a9e8c04fbe7490483443ccc35dea50

:: Running /boot/workdir/go/src/make.rc with args ["/boot/workdir/go/src/make.rc" "-force"] and env ["home=/usr/glenda" "path=/boot/workdir/go1.4/go/bin\x00.\x00/bin" "type=host-plan9-arm-0intro" "GOARM=7" "GO_BUILD_KEY_DELETE_AFTER_READ=false" "GOTOOLCHAIN=local" "status=" "GO_TEST_TIMEOUT_SCALE=3" "fs=aoe" "GOCACHE=/boot/cache" "GOROOT_BOOTSTRAP=/boot/workdir/go1.4" "sysname=pi4n" "workdir=/boot/workdir" "objtype=arm" "*=aoe" "WORKDIR=/boot/workdir" "GO_BUILDER_NAME=plan9-arm" "GO_TEST_TIMEOUT_SCALE=3" "GOBIN=" "GOROOT_BOOTSTRAP="] in dir /boot/workdir/go/src

Building Go cmd/dist using . (go1.20 plan9/arm)
Building Go toolchain1 using /go1.4.
go tool dist: FAILED: /go1.4/bin/go install -tags=math_big_pure_go compiler_bootstrap purego bootstrap/cmd/...: fork/exec /go1.4/bin/go: '/go1.4' file does not exist

Error: build failed: make script failed: exit status: 'make.rc 199: dist 697: 2'

I don't know why it's trying to load the bootstrap from /go1.4 instead of /boot/workdir/go1.4

dmitshur commented 2 months ago

> The plan9-arm LUCI builder is pretty stable now, with no repeatable failures and no more intermittent flakes than the legacy version. Do I need to do something formal to switch off the legacy plan9-arm builder or just stop running it?

Indeed, that is great! Both plan9/arm and plan9/386 LUCI builders look good to remove their known issue and consider them added. I sent CL 608155 to do that, and CL 607656 to mark them as migrated. (CC @0intro.)

When the coordinator is redeployed with the latter CL (next week), it'll stop sending work to the legacy plan9/arm builder. But given the equivalent LUCI builder is already providing good signal, I think it's fine for you to stop running it anytime. Thanks very much.


> I don't know why it's trying to load the bootstrap from /go1.4 instead of /boot/workdir/go1.4

CL 606835 works around the fact that go.dev/dl/ tarballs, which is where the coordinator gets its go1.22.6 bootstrap toolchain, have a top-level "go" directory. It does so by adding that directory's bin directory to $PATH (or the equivalent path variable on Plan 9) and clearing GOROOT_BOOTSTRAP. Both are there in the log, "path=/boot/workdir/go1.4/go/bin\x00.\x00/bin" and "GOROOT_BOOTSTRAP=", but it seems not to work on Plan 9 as it did for other OSes. Maybe there's something different about that logic in make.rc vs make.bash. It seems to be finding some go1.20 bootstrap, but not printing its path, and then falls back to a non-existent $home/go1.4 instead.

From my side, it might be easiest to adjust the plan9-arm buildlet to set $GOROOT_BOOTSTRAP to $WORKDIR/go1.4/go. From your side, you could try to place a go1.22.6 plan9/arm bootstrap in /go1.4 or something along those lines. Given the builder is about to go away as mentioned above, I don't think it's worth doing either. However, if there is a problem in the logic of finding a bootstrap in make.rc that you can spot, a fix to make it behave like the .bash version would be useful.
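
For reference, the lookup order the workaround relies on can be modeled roughly as follows. This is only a sketch of the behavior described above (an explicit GOROOT_BOOTSTRAP wins, then a `go` binary found on the path list, then a go1.4 directory under the home directory). It is not a port of make.bash or make.rc, and the function `findBootstrap`, its parameters, and the `exists` callback are all hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// findBootstrap is a hypothetical model of the bootstrap lookup, not the real scripts.
// pathList is split on sep (the log above shows "\x00" as the separator on Plan 9).
// exists stands in for a file system check so the sketch stays self-contained.
func findBootstrap(gorootBootstrap, pathList, sep, home string, exists func(string) bool) string {
	// An explicitly set GOROOT_BOOTSTRAP takes priority.
	if gorootBootstrap != "" {
		return gorootBootstrap
	}
	// Otherwise use the first path entry that contains a go binary;
	// the bootstrap root is taken to be the parent of its bin directory.
	for _, dir := range strings.Split(pathList, sep) {
		if exists(dir + "/go") {
			return strings.TrimSuffix(dir, "/bin")
		}
	}
	// Last resort: a go1.4 directory under the home directory.
	return home + "/go1.4"
}

func main() {
	// Pretend only the unpacked go.dev/dl/ tarball provides a go binary.
	exists := func(p string) bool { return p == "/boot/workdir/go1.4/go/bin/go" }
	path := "/boot/workdir/go1.4/go/bin\x00.\x00/bin"
	fmt.Println(findBootstrap("", path, "\x00", "/usr/glenda", exists))
}
```

With the values from the build log, this model selects /boot/workdir/go1.4/go. The failing build instead ended up at /go1.4, so the actual make.rc logic evidently takes a different branch than this model does.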

dmitshur commented 1 month ago

> I sent CL 608155 to [remove known issue], and CL 607656 to mark them as migrated [which stops coordinator from sending them work].

I haven't heard from you on those CLs, so I'll put them on hold for now. Whenever you're ready to take those next steps, please let me know and I'll rebase & submit them.

dmitshur commented 1 month ago

Closing this again: the original problem with write_go_bootstrap_tar failing with 404 is resolved, and the only currently connected legacy plan9 builder (plan9-arm) is working okay with a go1.22.6 bootstrap, so I don't think there's more left to do here.