golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

x/build/env/linux-mips: we have no Linux MIPS builders #31217

Closed bradfitz closed 4 years ago

bradfitz commented 5 years ago

All four of our MIPS variants have lost their builders and the email address of the old owner now bounces. (changed jobs, presumably)

So, we need to find a new owner, or remove the port per policy (https://github.com/golang/go/wiki/PortingPolicy#removing-a-port)

/cc @randall77 @cherrymui @ianlancetaylor @dmitshur

mengzhuo commented 4 years ago

@dmitshur

Sorry about pinging the wrong Dmitri :)

It's still not working.

  2019-09-17T14:02:15Z finish_write_snapshot_to_gcs after 3m30.9s; err=local error: tls: bad record MAC
  2019-09-17T14:02:15Z finish_make_and_test after 9m44.8s; err=writeSnapshot: local error: tls: bad record MAC

dmitshur commented 4 years ago

Indeed, it isn't working for your builder even after the coordinator has been deployed with TLS 1.2.

I noticed that it is working for other builder types. For example, I watched this trybot run, and its log on the linux-amd64 builder type said:

[...]
2019-09-17T13:49:05Z write_snapshot_to_gcs 
2019-09-17T13:49:05Z fetch_snapshot_reader_from_buildlet 
2019-09-17T13:49:05Z finish_fetch_snapshot_reader_from_buildlet after 55.2ms
2019-09-17T13:49:12Z finish_write_snapshot_to_gcs after 6.64s
[...]

We need to investigate this further.

FiloSottile commented 4 years ago

bad record MAC means network corruption, and you mentioned having to use a proxy to exit China, so is there a chance a middlebox is tampering with and corrupting the connection?

mengzhuo commented 4 years ago

@FiloSottile It might be. I think I should add the skipsnapshot option to my builder.

bradfitz commented 4 years ago

FWIW, the buildlet doesn't connect to GCS directly. It's the coordinator doing that.

All communications between the buildlet and farmer.golang.org are already encrypted. It's the buildlet that connects out to the farmer, and then the farmer initiates the connection to GCS.

The code returning the error is:

        wr := storageClient.Bucket(buildEnv.SnapBucket).Object(st.SnapshotObjectName()).NewWriter(ctx)
        wr.ContentType = "application/octet-stream"
        wr.ACL = append(wr.ACL, storage.ACLRule{Entity: storage.AllUsers, Role: storage.RoleReader})
        if _, err := io.Copy(wr, tgz); err != nil {
                st.logf("failed to write snapshot to GCS: %v", err)
                wr.CloseWithError(err)
                return err
        }

        return wr.Close()

If it's the io.Copy, we don't know for sure whether it's the reading part or the writing part that's failing, but yes... if the GFC is indeed killing connections, it's likely the reading part.

But in that case, this is just going to hurt elsewhere later. All the GFC can do is analysis based on SNI + certs + traffic patterns. This won't be the only bandwidth bursty part of the build. I think we're just going to be playing whack-a-mole & cross our fingers if you keep tweaking things.

Really we need to get a builder on a functioning network.

odeke-em commented 4 years ago

Thanks for ruling out the TLS versions as the issue with the deployment, Dmitri and Meng, and for chiming in, Filippo. I'll defer to Filippo.

gopherbot commented 4 years ago

Change https://golang.org/cl/195977 mentions this issue: dashboard: skipsnapshot for linux-mipsle-mengzhuo builder

mengzhuo commented 4 years ago

Thanks everyone. I'm using both polipo (to convert the SOCKS proxy to an HTTP proxy) and ssh (as the SOCKS proxy) to exit the firewall, and nothing unusual shows up in the logs.

dmitshur commented 4 years ago

@mengzhuo FYI, I've merged the change that sets SkipSnapshot to true and deployed the coordinator (about an hour ago).
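For readers unfamiliar with the setting, its effect can be sketched roughly as follows. This is an illustration under assumptions, not the build system's code: `builderConfig` and `steps` are hypothetical, and the step names are borrowed loosely from the log excerpts earlier in the thread.

```go
package main

import "fmt"

// builderConfig is a hypothetical, trimmed-down stand-in for a builder's
// configuration; only the option discussed in this thread is modeled.
type builderConfig struct {
	Name         string
	SkipSnapshot bool // if true, do not upload a post-make snapshot to GCS
}

// steps lists the phases a build would go through for cfg.
// The snapshot upload is simply left out when SkipSnapshot is set.
func steps(cfg builderConfig) []string {
	s := []string{"make"}
	if !cfg.SkipSnapshot {
		s = append(s, "write_snapshot_to_gcs")
	}
	return append(s, "test")
}

func main() {
	fmt.Println(steps(builderConfig{Name: "linux-mipsle-mengzhuo", SkipSnapshot: true}))
	fmt.Println(steps(builderConfig{Name: "linux-amd64"}))
}
```

Skipping the upload avoids the failing transfer, but it does not address the underlying network problem.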

mengzhuo commented 4 years ago

@dmitshur Thanks, there are 3 "ok" builds now. It seems good to me.

cherrymui commented 4 years ago

Thanks @mengzhuo for setting up the builder. I'm happy to see MIPS64 port still works.

daniel-santos commented 4 years ago

Hello. I've gotten the OK to make a mipsel board available. These are mt7620 boards, so MIPS 24KEc, with 256MiB of RAM. They are not great for building, so I cross-compile via OpenWRT and then run the testsuite on the board. I can have them hosted in Fort Lauderdale, Florida. Will this help?

I don't currently have a full toolchain working on the board, but I can probably get one working if you need to build on the board too (it will just be slow).

milanknezevic commented 4 years ago

Hello, we would like to dedicate two cavium,rhino_utm8 boards with debian linux for mips and mipsle builders. They have quad-core cpu, 8GB of ram and 240GB ssd disks. Should we start by following these steps https://github.com/golang/go/wiki/DashboardBuilders?

bradfitz commented 4 years ago

@milanknezevic, that sounds great! Thanks.

Would you be the builder's listed owner? Should we name it after you, or your organization? (who is "we", btw?)

milanknezevic commented 4 years ago

@bradfitz Sorry, we are RT-RK, a company based in Novi Sad, Serbia. The owner would be @bogojevic, and I would help him set up the builders.

bradfitz commented 4 years ago

@milanknezevic, I can set up the config for them after I get a bit more info. It seems that that processor is a 64-bit MIPS64 processor? And it seems that MIPS64 can also run 32-bit MIPS code? Can you confirm? Are you able to run both a cross-compiled GOARCH=mips and GOARCH=mips64 binary on one host and GOARCH=mipsle and GOARCH=mips64le on the other? Are the two boards the same hardware but boot in different endianness modes/kernels?

If so, should I set up each board to test both 32- and 64- bit GOARCHes?

/cc @cherrymui who probably knows how this works.

cherrymui commented 4 years ago

A 64-bit MIPS processor can almost always run 32-bit MIPS code. I think the kernel can be configured to support either ABI, or both (probably the default).

cherrymui commented 4 years ago

To clarify, I think a 32-bit kernel can only run 32-bit code, a 64-bit kernel can support either ABI or both.
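A quick way to check this on a given board is to cross-compile a tiny program for each GOARCH and see whether it runs. The sketch below is one possible probe; `nativeByteOrder` and `expectedOrder` are hypothetical helper names, not part of any Go tooling.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"unsafe"
)

// nativeByteOrder inspects how a 16-bit value is laid out in memory
// on the machine the binary is actually running on.
func nativeByteOrder() string {
	var x uint16 = 0x0102
	b := (*[2]byte)(unsafe.Pointer(&x))
	if b[0] == 0x02 {
		return "little-endian"
	}
	return "big-endian"
}

// expectedOrder returns the byte order implied by a MIPS GOARCH value,
// or "" for any other architecture.
func expectedOrder(goarch string) string {
	switch goarch {
	case "mips", "mips64":
		return "big-endian"
	case "mipsle", "mips64le":
		return "little-endian"
	}
	return ""
}

func main() {
	fmt.Printf("GOOS=%s GOARCH=%s word=%d-bit order=%s\n",
		runtime.GOOS, runtime.GOARCH, strconv.IntSize, nativeByteOrder())
	if want := expectedOrder(runtime.GOARCH); want != "" && want != nativeByteOrder() {
		fmt.Println("unexpected byte order for this GOARCH")
	}
}
```

Building it once per port, for example with `GOOS=linux GOARCH=mips go build` and again with `GOARCH=mips64`, and then copying both binaries to the same board shows whether that kernel accepts both ABIs.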

milanknezevic commented 4 years ago

> @milanknezevic, I can set up the config for them after I get a bit more info. It seems that that processor is a 64-bit MIPS64 processor? And it seems that MIPS64 can also run 32-bit MIPS code? Can you confirm?

Yes, that is correct: the processor is MIPS64 and the kernel is 64-bit, but GOARCH=mips64 has 32-bit rfs, so I'm not sure the 64-bit version is going to work. It should, but I haven't tried it yet on this board.

> Are you able to run both a cross-compiled GOARCH=mips and GOARCH=mips64 binary on one host and GOARCH=mipsle and GOARCH=mips64le on the other?

all.bash runs successfully on both of them, but only for GOARCH=mips and GOARCH=mipsle. We didn't try the 64-bit versions.

> Are the two boards the same hardware but boot in different endianness modes/kernels?

The boards are the same hardware, configured with different endianness.

> If so, should I set up each board to test both 32- and 64- bit GOARCHes?
>
> /cc @cherrymui who probably knows how this works.

For now, you can set up the 32-bit versions, and we will try to run GOARCH=mips64 and GOARCH=mips64le and give you the results. Currently, we can see one problem with GOARCH=mips64: we do not have 64-bit gdb.

bradfitz commented 4 years ago

> Currently, we can see one problem with GOARCH=mips64, we do not have 64-bit gdb.

If it works at all, that's great. We can configure it to skip the gdb test.

gopherbot commented 4 years ago

Change https://golang.org/cl/204043 mentions this issue: dashboard: add MIPS builders at RT-RK

bradfitz commented 4 years ago

@milanknezevic, I sent https://go-review.googlesource.com/c/build/+/204043 with the builder config for our side. Let me know if that looks okay. Notably, note the GOROOT_BOOTSTRAP value.

Email me (github username at golang.org) and I'll get you keys.

milanknezevic commented 4 years ago

I copied the Go bootstrap toolchains to /usr/local/go-bootstrap. all.bash finished successfully on the mips64le board. On the mips64 board, which now has gdb, a few gdb tests are failing, and TestLinuxSendfile is failing too. I will investigate further.

Also, I think you misplaced "GOHOSTARCH=mipsle", // avoids an extra make.bash phase; you probably wanted to add it to the mips64le builder.

gopherbot commented 4 years ago

Change https://golang.org/cl/205797 mentions this issue: stage0: add support for MIPS builders to stage0

bradfitz commented 4 years ago

@milanknezevic, @bogojevic, the RTRK MIPS builders are running an old version of the buildlet (they're reporting "Version 21").

You should be using the golang.org/x/build/cmd/buildlet/stage0 binary to launch the buildlet; it will download the latest buildlet binary per run if needed.

We will be deleting some code from the build system soon here that drops support for Version 22 and under. Your builders are reporting Version 21. Please either update to master or start using the stage0 binary.

Thanks!

bogojevic commented 4 years ago

@bradfitz, I'll take a look. The buildlet binaries on both boards are from March 9th. When I manually start the stage0 binary, it downloads the same buildlet again.

bradfitz commented 4 years ago

@bogojevic, oh, if you were already using the stage0, perhaps we just had old buildlets uploaded to GCS. I just rebuilt buildlet.linux-mips64 & buildlet.linux-mips64le and they're re-uploaded.

bradfitz commented 4 years ago

MIPS all looks happy. Closing.

bradfitz commented 4 years ago

(Thanks again, RT-RK!)