ianlancetaylor commented 2 years ago

Background

This proposal started out as a GitHub discussionn at https://github.com/golang/go/discussions/53060.

Go supports a number of different GOOS/GOARCH targets. We've defined a policy for adding new ports, described at https://go.dev/wiki/PortingPolicy.

Ports are divided into first class ports and secondary ports. The current first class ports are:

darwin/amd64
darwin/arm64
linux/386
linux/amd64
linux/arm
linux/arm64
windows/386
windows/amd64

The current secondary ports are:

aix/ppc64
dragonfly/amd64
freebsd/386
freebsd/amd64
freebsd/arm
freebsd/arm64
illumos/amd64
linux/ppc64
linux/ppc64le
linux/mips
linux/mipsle
linux/mips64
linux/mips64le
linux/riscv64
linux/s390x
android/386
android/amd64
android/arm
android/arm64
ios/arm64
ios/amd64
js/wasm
netbsd/386
netbsd/amd64
netbsd/arm
netbsd/arm64
openbsd/386
openbsd/amd64
openbsd/arm
openbsd/arm64
openbsd/mips64
plan9/386
plan9/amd64
plan9/arm
solaris/amd64
windows/arm
windows/arm64

The core Go team maintains the first class ports. It is less clear how the secondary ports are handled.

The existing porting policy says:

Every secondary port must have a maintainer who will make required updates in a timely manner.
- Every secondary port must have a builder, and a person who maintains that builder.
- If the builder for a port is failing for more than two weeks, a new maintainer is needed.
- If a builder fails for more than four weeks or is failing at the time of a release freeze, and a new maintainer cannot be found, the port will be removed from the tree.

However, in practice we do not follow those rules.

It has not been clear who the secondary port maintainers are, and there was no mechanism for adding or removing maintainers; this was recently improved by creating the @golang/port-maintainers GitHub team and subteams.
When a change requires an update to the port, it is the person making the change who takes responsibility for updating the port, not the port maintainer.
Maintainer responsibilities are in general unclear.
Several secondary port builders are maintained by the Go team.
We do not remove ports from the tree if they are failing for four weeks (we do remove ports if the OS is no longer supported, such as darwin/386).

The effect is that the work required to maintain secondary ports falls on people who are not familiar with those ports. This was not the goal of the porting policy, and it tends to slow down overall development of the core Go systems and discourages the adoption of new secondary ports.

We propose both loosening and tightening the current porting policy to address these concerns.

Proposal

The Go team will stop publishing binaries for secondary ports.
- The Go team currently publishes binaries for the following secondary ports, which we would stop doing if we adopt this part of the proposal:
- freebsd/386
- freebsd/amd64
- linux/ppc64le
- linux/s390x
- windows/arm64
- We encourage port maintainers to publish them instead; we can link to download pages from https://go.dev/dl as appropriate.
- We will investigate building binary releases by cross-compiling to secondary ports from a first-class port. This may be feasible by doing pure Go builds, while leaving any cgo builds to be on the user's machine using the user's C compiler.
- Note: when and if reliable testing hardware is available, the Go team is likely to promote windows/arm64 to be a first class port.
Require at least two maintainers for each secondary port.
- If maintainers resign or become unreachable and can't be replaced, we mark the port as broken; see below.
- This means that for some existing secondary ports we will have to find additional maintainers. As I write this I think that additional maintainers are needed for android, dragonfly, illumos, ios, js/wasm, openbsd, and solaris.
In general developers will not be required to make their code work on every secondary port.
- They are encouraged to do so if it is straightforward.
- If not, they are encouraged to notify port maintainers about pending requirements as soon as feasible, and to work with the port maintainers to get the ports working again.
- That said, ultimately the port maintainers are responsible for keeping their ports working.
Notify maintainers when a port breaks.
- Open an issue, CC'ing maintainers
- Perhaps also send an e-mail to golang-dev.
- We can do this preemptively if we know that some change is going to break a secondary port.
- Ideally this is an automated system, but for now it may be manual.
- For example, perhaps the system that creates https://build.golang.org/ can send e-mails about all failures to a newly created list. Maintainers can filter that list for the ports that they care about.
- Currently maintainers can track #52653 to see problems.
Clarify maintainer responsibilities on the PortingPolicy wiki page.
- A GOOS maintainer is responsible for any _GOOS files, any files with just that build tag, and blocks of code guarded by if runtime.GOOS == "mygoos".
- Similarly for a GOARCH maintainer.
- If technically possible, GOOS/GOARCH maintainers who are not already approvers should be given the right to +2 changes to files for which they are responsible.
- These CLs will still require a +1 review for style and sanity but not for detailed correctness; we will defer to the maintainers for that.
- Note that with two maintainers one of them can +2 changes by the other.
We will introduce a new concept: the broken port.
- The existing policy says that problems must be fixed within 4 weeks, which is not realistic; people are busy.
- If a port stops working, including the case where a builder stops working, we can decide to mark the port as broken.
- Or in some cases we can roll back the change that broke it; this is a judgement call.
- In general, a port can be considered broken if its builder has failed multiple times in a development cycle with a failure mode that does not occur for first class ports, and that failure mode is not believed to have been fixed or suppressed by a change in either a Go repository or the builder's configuration, and maintainers are not actively working on a solution.
- Any approver can declare that a port that meets these criteria is broken.
- The list of broken ports will be maintained in cmd/dist and will appear in the release notes if broken at release time.
- Attempting to build a broken port will fail unless make.bash or go tool dist is invoked with a new option -force.
- If a port is broken in release 1.N, then the core Go team can choose to remove the port from release 1.N+1.
- This is not obligatory and will depend on whether anybody is willing and able to maintain the port going forward.
- We currently have a list of incomplete ports in cmd/dist/build.go; these should probably be treated as broken ports.
- Currently the only entry on that list is linux/sparc64.
- The goal here is not to get ports out of the tree; if people are actively working on the port they should have as much as latitude as possible to fix it.
The default set of trybots will change to only cover first class ports.
- The https://build.golang.org/ page will continue to show all ports.

Discussion

This is not intended to be a big change to the current process. However, it is intended to be a change. It is intended to take some of the porting load off of the core Go team, while making it easier for port maintainers to make changes. It is intended to make it easier to add new ports to the tree.

In the long run it would be good to support out of tree ports. However, that requires a bunch of technical work, and there is no design for it.

bcmills commented 2 years ago

Currently the value of CGO_ENABLED affects both the build of the tolchain and the setting of cgo within the toolchain when it is used to build something.

Indeed: cmd/dist generates go/build.defaultCGO_ENABLED with a value based on the value of CGO_ENABLED. That would presumably need to be changed. https://cs.opensource.google/go/go/+/master:src/cmd/dist/buildgo.go;l=118;drc=7d87ccc860dc31c0cd60faf00720e2f30fd37efb

ianlancetaylor commented 2 years ago

This is a proposed change. I agree that it does not work today.

rsc commented 2 years ago

It sounds like we need to resolve how feasible it is and how much work it is to post cross-compiled toolchains for secondary ports. Otherwise it sounds like people are mostly on board. We'll work to figure that out.

ianlancetaylor commented 2 years ago

I opened #53862 for cross-compiled releases.

laboger commented 2 years ago

Otherwise it sounds like people are mostly on board.

I'm still a bit concerned about the language for broken ports and removal from releases. Some of it is vague and unclear how it will be managed. I could see cases where changes are merged right before the freeze date that cause consistent breakage in secondary ports that might not be simple to fix. I'm assuming that once the freeze happens there is still time during the freeze period to fix these problems, but it will be possible for other fixes merged during the freeze period to cause breakage in a secondary port and there might not be much time to fix those. Previously we at least had some trybot compile testing for secondary ports before merging which prevented some issues but it sounds like even those won't be run. I hope there could be some latitude in fixing bugs introduced under these circumstances but there is no mention of these scenarios.

Also can be you be more specific on what you mean by "removal from releases"? If it is removed from a point release, does that mean it could be fixed and enabled for some future point release, and if so how does a port that has been removed get enabled again?

ianlancetaylor commented 2 years ago

What I wrote in the proposal is "If a port is broken in release 1.N, then the core Go team can choose to remove the port from release 1.N+1". That is, there is a two stage process.

In release 1.21 (say) we decide that the plan9/arm port (say) is broken. That is, the builder is consistently failing in some way that doesn't happen on other ports, and nobody is working to fix it. We will list it as a broken port in the release notes. If somebody tries to build the port, we will give an error saying something like "this port is broken; to build it use the -force option." The -force option to make.bash or cmd/dist will build the port in the usual way even though it is considered broken.

If somebody fixes the plan9/asm port after the 1.21.1 release (say), we can remove it from the list of broken ports for 1.21.2.

If the plan9/arm port is broken for the entire 1.21 cycle, and still nobody is working on it, then the Go team can declare that port is dead. This is not obligatory. It should only be done if it seems clear that the port is not going to be fixed for the foreseeable future. This dead port can then be removed from the source code for the 1.22 release.

The goal here is not to reject ports as quickly as possible. If people are working on the port then there will be plenty of latitude.

I would be happy to hear suggestions for better language in the proposal text. Thanks.

laboger commented 2 years ago

Here are comments on some of the language, some a few suggestions:

We should not assume that maintainers will respond within 2 or 4 weeks; people are busy.

What does this mean? After 4 weeks then something happens?

In general, a port can be considered broken if its builder has failed multiple times in a development cycle with a failure mode that does not occur for first class ports, and that failure mode is not believed to have been fixed or suppressed by a change in either a Go repository or the builder's configuration

add something like: and maintainers are not actively working on a resolution.

Any approver can declare that a port is broken.

add: if it meets the conditions of a broken port.

In the most recent post you say:

If people are working on the port then there will be plenty of latitude.

I think this is a very good statement. If that was included somewhere in the policy for broken ports, that would address some of the questions I had in other statements of the policy.

ianlancetaylor commented 2 years ago

We should not assume that maintainers will respond within 2 or 4 weeks; people are busy.

What does this mean? After 4 weeks then something happens?

This was a reference to the existing policy, which says "If a builder fails for more than two weeks, it is time to start looking for a more active maintainer for the port. If a builder fails for more than four weeks or is failing at the time of a release freeze, and a new maintainer cannot be found, the port will be removed from the tree."

Will clarify the text.

In general, a port can be considered broken if its builder has failed multiple times in a development cycle with a failure mode that does not occur for first class ports, and that failure mode is not believed to have been fixed or suppressed by a change in either a Go repository or the builder's configuration

add something like: and maintainers are not actively working on a resolution.

Done.

Any approver can declare that a port is broken.

add: if it meets the conditions of a broken port.

Done.

In the most recent post you say:

If people are working on the port then there will be plenty of latitude.

I think this is a very good statement. If that was included somewhere in the policy for broken ports, that would address some of the questions I had in other statements of the policy.

Done.

Thanks.

rsc commented 2 years ago

It sounds like we need to resolve how feasible it is and how much work it is to post cross-compiled toolchains for secondary ports. Otherwise it sounds like people are mostly on board. We'll work to figure that out.

I spoke with @heschi, and the release team is OK with committing to posting cross-compiled toolchains for secondary ports with an appropriate disclaimer on them. I think that was the last part we were waiting for. (#53862 will need a fix, but that should not be hard.)

Are there any objections to accepting this proposal? Thanks!

laboger commented 2 years ago

spoke with @heschi, and the release team is OK with committing to posting cross-compiled toolchains for secondary ports with an appropriate disclaimer on them.

What is the disclaimer? I think the main thing is that we can get a fix for #53862.

jonathan-albrecht-ibm commented 2 years ago

I've been looking at how the s390x release is built and its already being cross compiled using a s390x C cross compiler (from debian package gcc-s390x-linux-gnu) for cgo on an amd64 host. The only issue I know of is that the go env of the s390x release ends up with CC="s390x-linux-gnu-gcc" so when using the release you have to override it and set CC=gcc to compile cgo programs. I think that should be easily fixable.

I think keeping the current s390x cross compiled release method would be preferable to https://github.com/golang/go/issues/53862. I realize a C cross compiler might not be readily available for all os/arches but for many it is. Could we consider this kind of cross compile for releases?

Refs: https://github.com/golang/build/blob/bdb5eef875d42f91abfcf53e9c9c25a123ac3035/dashboard/builders.go#L178-L183 https://github.com/golang/build/blob/master/env/crosscompile/linux-s390x-cross/Dockerfile https://github.com/golang/build/blob/fa54b01647f9778cd86163f627ac0f8eda6f7b31/internal/releasetargets/releases.txt#L28-L29

heschi commented 2 years ago

@laboger The disclaimer would be something to express that secondary ports may not be as stable as the first-class ports; I'm not sure what the wording would be exactly. But, for example, we might choose to cut a new release while there are known failing tests for a secondary port, and users should be aware that that sort of thing is a possibility.

@jonathan-albrecht-ibm Maintaining the cross-compilation builders is work that we would like to stop doing. Can you elaborate on why the current mode is preferable?

jonathan-albrecht-ibm commented 2 years ago

@heschi Actually I misunderstood what a user would need to do after downloading a release to build with cgo. I had thought it might require changes to some build scripts but I see now I was wrong. I should have read your earlier comments more carefully.

In any case, it wasn't obvious to me why not just use the C cross compiler packages where available since that's already in place. Certainly reducing builder maintenance could be worth the trade off. Just want to make sure I understand the reasoning.

rsc commented 2 years ago

We can already cross-compile with our own toolchain. No need to add the complexity of another toolchain. The CC= issue is easy to fix, as is #53862.

rsc commented 2 years ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

4a6f656c commented 2 years ago

Are there any objections to accepting this proposal? Thanks!

While I think the overall general direction is sensible, I still have some concerns:

Given that a port is either a first class (by choice of the Go team), second class or broken, does this imply that all of the builders/ports currently marked as having issues (all of NetBSD, Plan9, Solaris, etc) will be immediately treated as broken ports?
The language around making code work on a second class port is still very soft - I would much prefer to see this as "need to make a reasonable effort" rather than "are encouraged to do so". While I understand the intent, it has potential to make it very difficult to keep secondary ports working.

ianlancetaylor commented 2 years ago

@4a6f656c The current proposal text says that a port will only be broken if nobody is actively working on a solution. So, no, all ports will not be immediately marked as broken. And we wouldn't do that anyhow, the goal here is not to eliminate ports, is to create a sustainable policy. At some level we have to assume good will on all sides.

4a6f656c commented 2 years ago

@4a6f656c The current proposal text says that a port will only be broken if nobody is actively working on a solution. So, no, all ports will not be immediately marked as broken. And we wouldn't do that anyhow, the goal here is not to eliminate ports, is to create a sustainable policy.

@ianlancetaylor thanks - as far as I can see, these ports already meet the criteria specified for broken ports (hence currently being marked as having issues on the build page) - is that not the case? Or is the plan to make this call as the next development cycle progresses?

At some level we have to assume good will on all sides.

Indeed.

4a6f656c commented 2 years ago

The default set of trybots will change to only cover first class ports.

I also expect that this is going to result in a lot of unintended/unexpected breakage on second class ports, especially where an architecture is not covered by the first class ports (e.g. mips or riscv64) - is there a reason not to continue to ensure that the code will at least build on all ports? Or is this only referring to the actual execution of tests (i.e. the misc-compile-<secondclassport> builders will not be disabled as part of this)?

ianlancetaylor commented 2 years ago

as far as I can see, these ports already meet the criteria specified for broken ports (hence currently being marked as having issues on the build page) - is that not the case?

Is anybody actively working on them? Do they have maintainers who intend to address the issues?

I don't want to be draconian about it, but if nobody is maintaining them, and they don't work, then perhaps they should be marked as broken.

Let's be clear, though: a few failing tests shouldn't lead to a port being broken. In particular, it's always OK to skip the tests on that port (there should be an open issue for any such cases). As the policy written above says: "In general, a port can be considered broken if its builder has failed multiple times in a development cycle with a failure mode that does not occur for first class ports, and that failure mode is not believed to have been fixed or suppressed by a change in either a Go repository or the builder's configuration, and maintainers are not actively working on a solution." Note in particular that it's fine to suppress a failure mode. Of course that's not going to work if the port simply isn't working at all, but it's sufficient if there are some test failures.

the misc-compile- builders will not be disabled as part of this

Agreed.

4a6f656c commented 2 years ago

as far as I can see, these ports already meet the criteria specified for broken ports (hence currently being marked as having issues on the build page) - is that not the case?

Is anybody actively working on them? Do they have maintainers who intend to address the issues?

I do not know the answer to that, however the issues linked are:

https://github.com/golang/go/issues/49756 (last update 29th January 2022)
https://github.com/golang/go/issues/51443 (last update 7th March 2022)
https://github.com/golang/go/issues/50138 (last update 21st April 2022)
https://github.com/golang/go/issues/42212 (last update 28th April 2022)

Which means there have been no updates within the last 90 days.

I don't want to be draconian about it, but if nobody is maintaining them, and they don't work, then perhaps they should be marked as broken.

Right, and I'm not sure that it is being draconian, rather just adhering to the proposed policies.

I guess put another way, what is the relationship between a builder/port marked with "has known issue" and a "broken port" (are they the same thing)?

Let's be clear, though: a few failing tests shouldn't lead to a port being broken. In particular, it's always OK to skip the tests on that port (there should be an open issue for any such cases). As the policy written above says: "In general, a port can be considered broken if its builder has failed multiple times in a development cycle with a failure mode that does not occur for first class ports, and that failure mode is not believed to have been fixed or suppressed by a change in either a Go repository or the builder's configuration, and maintainers are not actively working on a solution." Note in particular that it's fine to suppress a failure mode. Of course that's not going to work if the port simply isn't working at all, but it's sufficient if there are some test failures.

Sure, although there is also a fine balance between the extreme of disabling a large number of tests to make the port "non-broken" versus having some reasonable level of stability and a small number of tests being disabled (with corresponding issues). That said, in most of the cases above, these seem to be long standing issues with (as far as I can tell) no clear or immediate resolution.

laboger commented 2 years ago

I can see where you would only want to run the default trybots when making a change that touches only general purpose files that may or may not affect secondary ports. But in the case where there are changes to files that are specific to a secondary port, it seems like those CLs should run trybots for those targets. Otherwise changes could be merged which don't even compile and break the build for that port.

rsc commented 2 years ago

@laboger That sounds like an interesting idea. Thanks. I filed #54375.

rsc commented 2 years ago

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

ianlancetaylor commented 2 years ago

I've updated https://go.dev/wiki/PortingPolicy with the changes described in this proposal.

bcmills commented 2 years ago

@ianlancetaylor, is there anything remaining to do for this proposal? (Can it be closed as complete?)

ianlancetaylor commented 2 years ago

What remains here is completion of #53862, and adding the notion of a broken port to cmd/dist (along with a way to build a broken port).

ianlancetaylor commented 2 years ago

Filed #56679 to track broken ports.

That completes the work for this proposal. Thanks to all the commenters.

ernado commented 1 year ago

@ianlancetaylor can you please clarify:

1) What is a process for becoming a port maintainer? 2) How can maintainers publish a binary for port? (https://github.com/golang/go/issues/59113)

I'm currently working on bringing riscv64 to Kubernetes (and friends), some dependencies rely on Docker Official Image packaging for golang, which gets binaries from https://dl.google.com/go and I'm not sure that maintainers will accept unofficial binaries.

Thank you!

tianon commented 1 year ago

Speaking as one of the maintainers of that image, we do support non official builds (there's a case statement in our build that will either download or build from the official sources); the problem with riscv64 is support in mainstream distros (still edge only in Alpine, still ports only in Debian, for example).

ianlancetaylor commented 1 year ago

@ernado

What is a process for becoming a port maintainer?

See https://go.dev/wiki/PortingPolicy. Basically, at least for an existing port, file an issue and get the current port maintainers to agree.

Our hope is to be able to publish binaries for all supported ports. See #53862.

golang / go

clarify Go support policy for secondary ports #53383

Background

Proposal

Discussion