golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.38k stars 17.59k forks source link

runtime: make defs_$GOOS.go regeneratable #23341

Open hirochachacha opened 6 years ago

hirochachacha commented 6 years ago

It seems that defs_$GOOS.go used to generate defs_$GOOS_$GOARCH.go in the old days.

https://github.com/golang/go/blob/596e3d9c0176db442a51202a2ae2834ac892d594/src/runtime/defs_darwin.go#L7-L12

However, -cdefs is not a valid cgo option today and people started to add features ad hoc.

I think making those regeneratable again have some worthy.

odeke-em commented 6 years ago

/cc @ianlancetaylor @aclements

aclements commented 6 years ago

I agree that this is unfortunate, but it would also be quite difficult to get these into a generatable state. defs_$GOOS.go never generated defs_$GOOS_$GOARCH.go; it generated defs_$GOOS_$GOARCH.h, which was eventually converted to Go when we converted the rest of the runtime. Hence, it's not a matter of making then generatable again, but of making them generatable at all.

I don't think it would be impossible. go tool cgo -godefs produces something not entirely unlike like the defs files we have. However, the field names are all different (the most obvious difference being that they're all exported, but there are also many subtler naming differences). More problematic is that many of the field types are different. For example, just looking at defs_linux_amd64.go, we turned many pointers into uintptr and flattened out some union fields.

So, while this seems like a laudable goal, I'm not sure it's actually worth the engineering effort given how slowly and specially these files grow. IMO, it would be much more worth the time to auto-generate all the syscall definitions used by the runtime, since we already have most or all of the machine-readable definitions we need in the syscall package, most of the syscall code is very regular, we tend to need new syscalls across a large number of platforms, and the code is really annoying to write by hand.

ianlancetaylor commented 6 years ago

Automating the syscall package should be a matter of backporting the work that has been done in the x/sys/unix package.

aclements commented 6 years ago

Automating the syscall package should be a matter of backporting the work that has been done in the x/sys/unix package.

I think the std syscall package and x/sys/unix already use nearly the same code generators. I was thinking that these could be retargeted to produce the syscall assembly code used by the runtime, which is different code but can be derived from the same metadata.

ianlancetaylor commented 6 years ago

Sounds good, but x/sys/unix has diverged pretty far from syscall. For example, x/sys/unix now uses Docker images on GNU/Linux. Nobody has bothered to backport any of this work to syscall since syscall is frozen and there is no real need to regenerate the files.

hirochachacha commented 6 years ago

Thank you for your comments.

I don't think it would be impossible. go tool cgo -godefs produces something not entirely unlike like the defs files we have. However, the field names are all different (the most obvious difference being that they're all exported, but there are also many subtler naming differences). More problematic is that many of the field types are different. For example, just looking at defs_linux_amd64.go, we turned many pointers into uintptr and flattened out some union fields.

Are those changes not acceptable in theory? I know it's a hassle though.

IMO, it would be much more worth the time to auto-generate all the syscall definitions used by the runtime, since we already have most or all of the machine-readable definitions we need in the syscall package, most of the syscall code is very regular, we tend to need new syscalls across a large number of platforms, and the code is really annoying to write by hand.

OK, I'll start from there. But after that, I'd like to deal defs_$GOOS.go also if it is acceptable.

aclements commented 6 years ago

Are those changes not acceptable in theory? I know it's a hassle though.

If they were not significantly invasive to the rest of the runtime, then yes, that would be acceptable. I'm mostly worried about the differing field types. What might work well would be to have a runtime-specific generator that runs go tool cgo -godefs to do the heavy lifting and then does some simple rewriting of cgo's output. For example, it could unexport the fields and rewrite field types. For somewhat ad hoc things like the field types, it could be driven by a few simple directives in defs_$GOOS.go. And for the occasional really weird thing it's fine if it remains written by hand in defs_$GOOS.go (it'll just get copied). What do you think?

IMO, it would be much more worth the time to auto-generate all the syscall definitions used by the runtime

OK, I'll start from there.

That would be awesome.

Note that some of the syscalls don't follow the boilerplate and that's totally fine. Those can be stay written by hand. But there are a ton of syscall definitions across all of the arch/OS combinations that are pure boilerplate and should be generated.

(Just please don't write the generator in Perl like the one in the syscall package. Nothing against Perl, just that I don't know it so I wouldn't be able to modify the generator. :)

hirochachacha commented 6 years ago

For example, it could unexport the fields and rewrite field types. For somewhat ad hoc things like the field types, it could be driven by a few simple directives in defs$GOOS.go. And for the occasional really weird thing it's fine if it remains written by hand in defs$GOOS.go (it'll just get copied). What do you think?

I'd rather adjust other runtime code to use output of go tool cgo -godefs as possible even it it's hard work. I think cgo is already doing correct job. I may write cgo -godefs wrapper for these adjustments, but I want it to use raw output gradually.

Note that some of the syscalls don't follow the boilerplate and that's totally fine. Those can be stay written by hand. But there are a ton of syscall definitions across all of the arch/OS combinations that are pure boilerplate and should be generated.

I see, thanks.

(Just please don't write the generator in Perl like the one in the syscall package. Nothing against Perl, just that I don't know it so I wouldn't be able to modify the generator. :)

no worries :) I also don't know Perl. I'll use the language we know.

aclements commented 6 years ago

I'd rather adjust other runtime code to use output of go tool cgo -godefs as possible even it it's hard work.

There are technical reasons not to do this for everything, particularly the mismatched field types. Transforming various pointer-typed fields into uintptr avoids having the garbage collector scan these and avoids potentially producing write barriers for these fields (it probably also avoids some ugly casts). Similarly, some of the fields are uintptr instead of uint32/uint64 to make them consistent across 64- and 32-bit arches so the runtime code can be portable. There may also be some fields where it's fine for the type to change.

Making the fields exported probably doesn't matter as long as the types themselves aren't exported. Though if there are any transformations going on at all, this would be an easy one. Some of the field names are different in silly ways (like timecal.tv_sec vs timeval.sec) and that's fine to adjust in the runtime code.

My broader point is that adjusting other runtime code is fine, but do no harm. Things where the code can be adjusted without affecting the complexity are fine, but I'd be very weary of things that make the runtime code more complex.

hirochachacha commented 6 years ago

There are technical reasons not to do this for everything, particularly the mismatched field types. Transforming various pointer-typed fields into uintptr avoids having the garbage collector scan these and avoids potentially producing write barriers for these fields (it probably also avoids some ugly casts). Similarly, some of the fields are uintptr instead of uint32/uint64 to make them consistent across 64- and 32-bit arches so the runtime code can be portable. There may also be some fields where it's fine for the type to change.

Thank you for the detailed explanation. Now I understand how differing types corrupt the runtime. And yes, some of those can be solved by introducing new directives, some maybe not. I cannot predict those at this point.

My broader point is that adjusting other runtime code is fine, but do no harm. Things where the code can be adjusted without affecting the complexity are fine, but I'd be very weary of things that make the runtime code more complex.

Definitely. I'll consider that.

gopherbot commented 6 years ago

Change https://golang.org/cl/87115 mentions this issue: runtime: make sys file generatable