golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/cgo: arrange to pass unmodified Go source files to compiler, go/types #16623

Open mdempsky opened 8 years ago

mdempsky commented 8 years ago

Currently cgo support is implemented with cmd/cgo as a separate tool that analyzes cgo-using Go code and transforms it into standard Go code. This proposal is to extract the analysis logic into a separate internal package that can be reused by cmd/compile and go/types directly without needing source rewrites. I.e., turn them into native cgo compilers, rather than needing a separate "cgofront" preprocessor step.

The expected benefits are:

  1. Better error messages. E.g., currently cmd/compile emits error messages about calls to _Cfunc_foo instead of C.foo (a minimal example follows this list).
  2. Better position information. Currently when using golang.org/x/tools/go/loader to analyze cgo-using source, the column location and byte offset information is wrong because it applies go/types to the post-transformation Go source files. Thanks to //line directives, we can at least identify the original line number, but it's troublesome/imprecise to write type-sensitive refactoring tools that rely on column information and want to support cgo-using packages.
  3. Better desugaring of cgo calls. Currently cmd/cgo needs to rewrite expressions into valid Go code to be accepted by cmd/compile, which is sometimes subtle (e.g., #7757, #9557, #13635, #13930, #16591). By desugaring within the compiler itself, we have more flexibility about the sorts of fundamental operations we can use.
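
For instance, a minimal (hypothetical) cgo file of the sort benefit 1 has in mind; the comments note where the mangling shows through today:

package main

// int add(int a, int b) { return a + b; }
import "C"

import "fmt"

func main() {
    // cmd/cgo rewrites this call into a call to a generated wrapper
    // (roughly _Cfunc_add), so a type error at this call site is
    // reported against that mangled name rather than against C.add
    // as written here.
    fmt.Println(C.add(1, 2))
}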

Potential downsides and counter-arguments:

  1. cmd/cgo still needs to exist for gccgo, but cmd/cgo already has substantially different behavior for supporting cmd/compile vs gccgo.
  2. Might complicate #15681 (scheduling cmd/cgo earlier), but I expect an alternative strategy is equally viable: schedule the C compilations later. Currently the only reason we need to schedule the C compilations before the Go compiler is to produce _cgo_imports.go. But this file only contains //go:cgo_import_dynamic and //go:cgo_dynamic_linker directives, which don't actually affect Go compilation; cmd/compile just writes them back out again so cmd/link can ultimately handle them.
  3. Increases coupling between cmd/compile and cgo, but cmd/compile already is / needs to be cgo-aware, and cmd/go already has extensive special case code for cmd/cgo (unlike any other tool that generates Go source).

Alternative ideas that might achieve similar benefits while generalizing to tools other than cmd/cgo:

  1. Add a //go:errorname directive so cmd/cgo can express that _Cfunc_foo should be referred to as "C.foo" in error messages. It's not obvious this would generalize fully though; e.g., cmd/yacc rewrites $1 to yyDollar[1], which would need something more powerful to express.
  2. Generalize //line directives to something that records column/offset information too (similar to JavaScript Source Maps); see the sketch after this list. Again, I'm not sure how useful in practice this would actually be for tools other than cmd/cgo.
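
As a rough sketch of item 2: today's directive records only a file and line, while the generalized form would also carry a column (and perhaps a byte offset). The extended syntax shown here is hypothetical, though later Go releases did adopt a similar filename:line:col form:

// Today: maps the following line back to foo.go, line 42.
//line foo.go:42

// Hypothetical generalized form: additionally records column 17.
//line foo.go:42:17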

/cc @ianlancetaylor @alandonovan @griesemer

minux commented 8 years ago

This means making cgo part of the compiler?

One of the reasons cmd/cgo is a separate tool is that other Go compilers could potentially benefit from it to support cgo, if they support enough of cgo's magic comments.

mdempsky commented 8 years ago

My idea is to keep cmd/cgo behaving the same as it does today, but to extract much of its functionality as a new package like "internal/cgo" that can be used by cmd/cgo, cmd/compile/internal/gc, and go/types.

Like I said above (downside 1), we need cmd/cgo for gccgo (because it's written in C++ and can't rely on a Go library).

minux commented 8 years ago

If much of the functionality is extracted into another package, would we still use cmd/cgo when building cgo programs?

If not, cmd/cgo might bit rot eventually.

mdempsky commented 8 years ago

I expect by default we would not execute cmd/cgo when building cgo programs. I think we could make that an option though (e.g., cgomode=internal/external), or at least run the current cgo regress tests both with and without cmd/cgo, if we're concerned about bit rot.

alandonovan commented 8 years ago

Better position information. Currently when using golang.org/x/tools/go/loader to analyze cgo-using source, the column location and byte offset information is wrong because it applies go/types to the post-transformation Go source files. Thanks to //line directives, we can at least identify the original line number, but it's troublesome/imprecise to write type-sensitive refactoring tools that rely on column information and want to support cgo-using packages.

It is indeed troublesome to write refactoring tools for cgo-using packages, but I don't see how the proposal solves the problem. Refactoring tools based on go/types must consume the output of cgo preprocessing, not the cgo source code itself, but that means any file surgery they do will be applied to the generated file---at best; //line directives further confuse matters. In this regard, cgo differs little from other code generation tools such as the protocol compiler: its input language is not Go, even though it may look like Go.

mdempsky commented 8 years ago

@alandonovan The proposal is that instead of generating intermediary Go source files, cmd/compile and go/types would consume the original .go source files directly. The refactored cmd/cgo package would provide the logic for processing the import "C" preambles such that (e.g.) go/types could directly construct an appropriate go/types.Package.

It's true that there are other tools that generate Go source code, but those produce .go files that the user is responsible for managing themselves. Cgo is distinguished from them because it's transparently handled by cmd/go, not to mention having explicit runtime and toolchain support.

alandonovan commented 8 years ago

I see. While developing the loader package, I tried a similar approach, modifying go/types to accept source files in the cgo dialect, but gri convinced me that it was feature creep and that cgo should be treated as a black box. Here are some of the problems that came up:

  1. The type checker rejects qualified identifiers C.f where f is unexported. It would have to make an exception for package "C".
  2. Package "C" appears different to every package that imports it. This breaks an invariant of the importer interface. (Though vendoring may have since forged a path here.)
  3. In cgo, you can call a function C.f as if it were overloaded with two different types, for example x := C.f() and x, err := C.f() (see the sketch after this list). This requires a hack to the type-checking rule for function calls, and results in ast.AssignStmts whose types break the spec's assignability rules, confusing clients.
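
For item 3, a small (hypothetical) illustration of the two calling forms cgo permits for the same C function:

package main

// #include <stdlib.h>
import "C"

import "unsafe"

func main() {
    name := C.CString("HOME")
    defer C.free(unsafe.Pointer(name))

    // Single-result form: just the C return value.
    v := C.getenv(name)

    // Two-result form: the same C function, now also returning errno
    // as an error; this is the "overloading" described in item 3.
    v2, err := C.getenv(name)
    _, _, _ = v, v2, err
}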

Also, we should consider that future extensions to cgo might not be possible to support at all in go/types.

mdempsky commented 8 years ago

Thanks for the list of problems. My thoughts:

  1. That's true, but go/types already special cases C.f when in FakeImportC mode. I don't imagine it to be any more invasive to change the later if !exp.Exported() check to something like if !exp.Exported() && !pkg.imported.cgo.
  2. True. The importer would also need a way to access the C preamble. Since C imports need to be handled specially anyway, I was envisioning just handling them internally within go/types. If customizing its behavior proved useful, we could provide a CgoImporter interface.
  3. That's an interesting point. I see two possible ways to address it:
    1. Do something similar to how we already handle "commaok" expressions. We could potentially reuse TypeAndValue.HasOk for this, or add TypeAndValue.HasCgoErr if cleaner.
    2. For each C function, define two go/types.Func objects: one for the single-value variant, and another for the two-value variant. Within go/types, we can arrange for Info.Uses, etc. to point to the contextually appropriate variant.

It seems unlikely to me that we'd add functionality to cmd/cgo that go/types could not support at all. The source Cgo files will need to remain representable by go/ast so that cmd/gofmt can format them. If it ever becomes necessary to process additional inputs beyond the .go files for Cgo processing, we can extend go/types.Checker's API to allow callers to provide them.

Lastly, for completeness: to support cgo in go/types, users will need to provide an AST with the C preamble comments, but by default go/parser discards comments. There are a couple ways to resolve this:

  1. Add a ParseCgo Mode flag or something to go/parser (and perhaps ScanCgo flag to go/scanner). This would indicate the resulting parse tree should preserve any comments used by cgo, but all other comments in the source file can be omitted.
  2. ParseCgo could just be the default behavior, or we could simply require users to always set ParseCgo/ParseComments when they want to analyze Cgo packages (today's ParseComments workaround is sketched below).
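
For comparison, a sketch of what a tool has to do today to keep the preamble: parse with parser.ParseComments and pick the doc comment off the import "C" declaration (the file name and error handling are illustrative):

package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
    "log"
)

func main() {
    fset := token.NewFileSet()
    // ParseComments keeps all comments, including the cgo preamble
    // immediately preceding `import "C"`. A ParseCgo mode, as suggested
    // above, could instead keep only that preamble.
    f, err := parser.ParseFile(fset, "x.go", nil, parser.ParseComments)
    if err != nil {
        log.Fatal(err)
    }
    for _, decl := range f.Decls {
        d, ok := decl.(*ast.GenDecl)
        if !ok || d.Tok != token.IMPORT {
            continue
        }
        for _, spec := range d.Specs {
            imp := spec.(*ast.ImportSpec)
            if imp.Path.Value != `"C"` {
                continue
            }
            // Depending on grouping, the preamble hangs off the spec
            // or the surrounding declaration.
            if imp.Doc != nil {
                fmt.Print(imp.Doc.Text())
            } else if d.Doc != nil {
                fmt.Print(d.Doc.Text())
            }
        }
    }
}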

adg commented 8 years ago

Anyone have more comments for @mdempsky?

alandonovan commented 8 years ago

LGTM.

I prefer option 3(ii) over 3(i). We are likely to discover several awkward corner cases as we implement this and use it across the tools, but the result will still be an improvement on the status quo.

Having ParseCgo as the default would be nice, but it's fiddly because you have to select the scanner disposition towards comments before you begin. Two approaches to consider are: (i) restarting the scan from the beginning if you find an import "C". This is possible because the input is a byte slice not an io.Reader, but it might be hard to avoid duplicate error reporting. (ii) using ad-hoc logic in the parser to find the comment preceding the import "C" when you encounter the import. This is probably easier to implement.

rsc commented 8 years ago

This may be interesting but it seems lower priority than just about anything else we could be doing. And doing this is going to break lots of existing cgo code in mysterious ways that will require a lot of debugging.

Is there something that just can't cope with the current way cgo works? It sounds like the answer is no.

rsc commented 8 years ago

I'm also generally skeptical about making cmd/compile include cmd/cgo. That's a layering violation: cgo is a build-system issue, not a Go-compilation issue. I don't want the compiler to turn into the build system.

rsc commented 8 years ago

Does that mean we should put SWIG (a C++ program) into cmd/compile too?

mdempsky commented 8 years ago

And doing this is going to break lots of existing cgo code in mysterious ways that will require a lot of debugging.

I'm curious why you think this. My expectation is that cgo code will compile identically as before.

Is there something that just can't cope with the current way cgo works?

My motivating use case is that it's currently impossible to write robust, type-sensitive refactoring tools that work on Go source files that use cgo, because you have to apply go/types to the post-processed source.

I'm also generally skeptical about making cmd/compile include cmd/cgo. That's a layering violation: cgo is a build-system issue, not a Go-compilation issue. I don't want the compiler to turn into the build system.

I don't expect cmd/compile and go/types to take on any more responsibility for compiling C code than cmd/cgo already does internally. And x/tools/go/loader already has to run cmd/cgo.

Does that mean we should put SWIG (a C++ program) into cmd/compile too?

Point taken. Though I'll note the sources for SWIG bindings are .swig files, not .go files, so I feel there's a lower expectation that refactoring tools should---or even need to---support them. Certainly it would be nice if gorename could rename an identifier exported by C++ and have it also update the entire C++ code base, but I don't think anyone expects that.

However, I do think it would be nice if gorename would work for renaming symbols in files that happen to also use cgo. Currently, it replaces the file with the cgo-processed source. (I.e., you end up with a source file that says "Created by cgo - DO NOT EDIT" at the top.)

minux commented 8 years ago

For refactoring tools, don't they just need to know the type of each cgo symbol? In that case, all they need is cmd/cgo's symbol-mangling rule, and then they can figure out the necessary information from the cgo output.

alandonovan commented 8 years ago

Currently, [gorename] replaces the file with the cgo-processed source. (I.e., you end up with a source file that says "Created by cgo - DO NOT EDIT" at the top.)

Wow, I was not aware of that. I've filed https://github.com/golang/go/issues/17839.

mdempsky commented 8 years ago

@minux cgo does more invasive AST rewrites than just mangling C.bar to _Cfoo_bar. For example, it also rewrites C function calls into function literal invocations to instrument _cgoCheckPointer calls.

Even if the solution is to just reconstruct the pre-cgo AST, there should be a standard library or x/tools package for handling that. Otherwise, every time we change cgo's implementation details, we end up breaking third-party tools.

minux commented 8 years ago

But for a refactoring tool, the cgo rewrite shouldn't matter (it's a cgo implementation detail). I imagine the tool only cares that C.bar is a function of type func(int32) (int, error); it should then be able to handle C.bar as a special kind of function call and carry on (the refactoring tool should only care about the pre-cgo Go source files, right?).

mdempsky commented 8 years ago

@minux go/types doesn't currently support that. That's the point of this proposal.

rsc commented 8 years ago

To be clear, the primary expected benefit is not the three in the list but instead that go/types be modified to be able to more easily analyze packages that use cgo.

There's a lot of configuration that goes into the invocations of cgo. Today that configuration is partly in the environment and partly baked into the toolchain, specifically the go command. If cgo moves into a package, then that configuration has to be baked into the package too. That would mean you would have to rebuild your godoc, say, just to start viewing a package that needs a different baked-in way to build (say) arm binaries.

It seems like it would be better to introduce an easy way for the go command to provide the generated files, and then have go/types invoke the go command. Then you can substitute a different go command easily (via PATH changes) without having to rebuild everything that uses go/types when this kind of thing changes.

Also:

I'm curious why you think this. My expectation is that cgo code will compile identically as before.

Obviously that's the goal. My experience is that this is basically impossible for any significant change to cgo. Something always breaks. Many things, usually.

mdempsky commented 8 years ago

To be clear, the primary expected benefit is not the three in the list but instead that go/types be modified to be able to more easily analyze packages that use cgo.

That's my motivating case, yes. I meant for that to be covered by items 2 and 3. Sorry for the ambiguity.

There's a lot of configuration that goes into the invocations of cgo. Today that configuration is partly in the environment and partly baked into the toolchain, specifically the go command.

As far as I can tell, the information baked into cmd/go is just defaultCC, defaultCXX, and defaultPkgConfig. We could fetch them via go env if we wanted to avoid baking them into go/types tools. (I'm not convinced we need to though; go/build already bakes in its own cgo information rather than querying go tool dist list -json.)
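
For illustration, a go/types-based tool could query those defaults at run time instead of baking them in; a rough sketch, assuming the installed go command exposes CC and CXX via go env:

package main

import (
    "fmt"
    "os/exec"
    "strings"
)

func main() {
    // Ask the toolchain for its C/C++ compiler defaults rather than
    // compiling them into this tool.
    out, err := exec.Command("go", "env", "CC", "CXX").Output()
    if err != nil {
        panic(err)
    }
    vals := strings.Split(strings.TrimSpace(string(out)), "\n")
    fmt.Println("CC:", vals[0])
    if len(vals) > 1 {
        fmt.Println("CXX:", vals[1])
    }
}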

It seems like it would be better to introduce an easy way for the go command to provide the generated files, and then have go/types invoke the go command.

The problem I'm proposing to address here isn't that invoking cgo is tricky (though it is): x/tools/go/loader already supports invoking cgo as needed.

The problem is when writing type-sensitive AST refactoring tools, we want to work with the original AST, not the rewritten AST. Otherwise you get issues like #17839 or mdempsky/unconvert#3.

rsc commented 8 years ago

Another point: if we create this library then the next request will be a way to use it in other tools instead of invoking cmd/cgo. So then we have to design the library well. Cgo, for better or worse, is already designed and works.

I think part of the problem with this proposal is that it specifies mechanism. It sounds like the problem you want to solve is how to make go/parser and go/types work on cgo programs in a way that allows, for example, gorename or gorefactor to edit cgo programs. I think it might make sense to shift this proposal to talk about ways to solve that problem. It may not involve cgo as a library at all. It still seems to me that cmd/cgo can be used in a solution to that problem, much more quickly than this refactoring.

rsc commented 8 years ago

I thought some more about this on my way home. Apologies for the high latency of this conversation overall. I will try to do better.

It wasn't clear to me until your last comment that the fundamental problem being addressed here is the problem of not being able to apply code rewriting tools to Go source files that import "C". Rereading the initial post I see that listed as benefit 2, but I did not appreciate that the problem was the lack of access to the original source code in order to write the source code back out.

The core of this proposal is really to eliminate source-to-source translation as the implementation mechanism for defining what import "C" means in cgo inputs. Whether cgo is a library is a separate implementation detail - a binary is just a library with a command line as its API - as is the choice of which other program calls into the code in whatever form it takes. Let's put those details aside and focus on what additional functionality would have to be added to cgo to avoid a source-to-source rewrite.

It seems to me that if cgo can't rewrite the code, it would instead need to write out a definition of the "C" package that answers whatever questions the compiler might have; the compiler would consult that definition as needed. Cgo would still need to write out _cgo_gotypes.go with the wrappers and type definitions; it just wouldn't need to write out modified copies of the original sources (x.cgo1.go for x.go). For example, cgo might write out a file, say, _cgo_defs.txt. Then the go command could invoke the compiler with -cgodefs _cgo_defs.txt to pass the cgo definitions in. Similarly, whatever x/tools/go/loader does to invoke cgo today it could still roughly do but expect to read _cgo_defs.txt instead of loading modified source files.

The next question is what _cgo_defs.txt looks like. A simple but extensible choice would be to define a Go struct Def with the info cgo needs to convey and write out a JSON map[string]*Def. To make this concrete I looked at how cgo translates package net, and it looks like the only information you really need to get to the compiler is the mapping from C.xxx to the Go name used in _cgo_gotypes.go. That's almost trivial enough not to bother with JSON, but functions have two names (depending on whether they are invoked as x, err := C.f() to get errno), so might as well make things minimally extensible. Looks like this would suffice (with a few more symbols to cover all of package net, of course):

{
    "AF_INET": {"GoName": "_Cconst_AF_INET"},
    "char": {"GoName": "_Ctype_char"},
    "getaddrinfo": {"GoName": "_Cfunc_getaddrinfo", "Go2Name": "_C2func_getaddrinfo"},
    "struct_addrinfo": {"GoName": "_Ctype_struct_addrinfo"}
}
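
For concreteness, a minimal Go sketch of the Def struct and writer described above (field and helper names are assumptions, not an agreed format):

package cgodefs

import (
    "encoding/json"
    "io"
)

// Def records how one C.xxx name maps onto the Go identifiers that
// cgo emits in _cgo_gotypes.go.
type Def struct {
    GoName  string
    Go2Name string `json:",omitempty"` // two-result variant, for x, err := C.f()
}

// writeDefs writes the map as indented JSON, e.g. into _cgo_defs.txt.
func writeDefs(w io.Writer, defs map[string]*Def) error {
    enc := json.NewEncoder(w)
    enc.SetIndent("", "    ")
    return enc.Encode(defs)
}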

With that extra information, the compiler and go/types could both work on the original source files. Cgo could keep writing the *.cgo1.go files for a while, to let older tools or build systems continue to use them.

What I've described handles everything in cgo except the new pointer check calls that @ianlancetaylor added. Those would need to be recreated in the compiler, but of course the compiler is in a better position to write them, and go/types need not care.

If you want to move forward with this, my suggestion would be:

  1. Change cmd/cgo to write _cgo_defs.txt in addition to the current outputs.
  2. Change cmd/go to pass _cgo_defs.txt and the unmodified source files (in place of the modified ones) to the compiler.
  3. Change cmd/compile to use _cgo_defs.txt to resolve names in the fake package C. Pass all.bash except cgo pointer checks.
  4. Change cmd/compile to add the cgo pointer checks as appropriate. Pass all.bash.
  5. Change x/tools/go/loader to use _cgo_defs.txt and the original source files.

The key part of the original proposal was the ability to have cmd/compile and go/types load unmodified source files and do the "translation" themselves. The _cgo_defs.txt file accomplishes that without significant changes to cgo itself or to the build process. The specific format of _cgo_defs.txt must be defined, but that API would have had to have been defined for "cgo as a library" too (along with a lot of other work that _cgo_defs.txt avoids).

In fact, you might actually get away with not having _cgo_defs.txt at all. You could instead just hard-code in the compiler the idea that after import "C", C.name is resolved by looking for _Cconst_name or _Ctype_name or _Cfunc_name or _C2func_name. That seems fine to me, especially for a quick prototype. Then the only change to the build process is to invoke the compiler on the original x.go instead of x.cgo1.go. Another possibility would be some kind of special comment on the definitions already in _cgo_gotypes.go.

//go:cgo_cname C.uint
type _Ctype_uint uint32

//go:cgo_cname C.getaddrinfo
//go:cgo_unsafe_args
func _C2func_getaddrinfo(p0 *_Ctype_char, p1 *_Ctype_char, p2 *_Ctype_struct_addrinfo, p3 **_Ctype_struct_addrinfo) (r1 _Ctype_int, r2 error) {

Whether that makes more sense depends a bit on what is best for x/tools/go/loader. Cgo could always write the comments for the compiler and the _cgo_defs.txt file for go/loader, I suppose.

The most important point is that we should focus on what additional functionality is needed from cgo instead of whether it is wrapped up in a library or a separate binary. The functionality can be added and used either way. (And if we leave it as a separate binary, that probably creates less work overall.)

ianlancetaylor commented 8 years ago

That's an interesting approach. I just want to clarify that the name translation is in fact context-specific, because of course C.f may be called both to return a single value and to return two values, and we need to know the context to know which translated name to use. We also translate C.f differently if it is used in an expression other than calling it.

Also, an example of a case where we need to know more than just the translated name is union types; in order to generate the cgo checks correctly to address issue #15942, we need to know whether the union type can contain a pointer.

These notes don't invalidate the basic idea, but they do complicate it.

mdempsky commented 8 years ago

Similarly, whatever x/tools/go/loader does to invoke cgo today it could still roughly do but expect to read _cgo_defs.txt instead of loading modified source files.

To be clear, after invoking cgo, all that x/tools/go/loader currently does with the modified source files is:

  1. Use go/parser to parse them into go/ast representation.
  2. Apply the go/types type checker.
  3. Return the go/ast representation and package information produced during type checking.

If we simply parse the original files instead of the cgo-rewritten files, go/types would then need to process the _cgo_defs.txt file itself. So unless I'm misunderstanding your proposal, it seems to require one or more of:

  1. Invoke cmd/cgo within go/types instead of x/tools/go/loader;
  2. Plumb _cgo_defs.txt through go/types' public API; and/or
  3. Duplicate go/types logic into x/tools/go/loader, with extensions to support _cgo_defs.txt.

I've been advocating for 1 (as an internal library instead, but as you point out this is an implementation detail to be considered separately).

2 seems acceptable too, but means committing to a public API between cmd/cgo and go/types. And if we're going to do that, it seems simpler to just make go/types invoke cmd/cgo instead (i.e., approach 1).

3 seems undesirable to me for maintainability.

Do you support any of these three approaches? Or did you have any alternatives in mind?

bcmills commented 8 years ago

The core of this proposal is really to eliminate source-to-source translation as the implementation mechanism for defining what import "C" means in cgo inputs.

That would likely also make #13467 more feasible to address, which may be relevant to the cost/benefit analysis here.

rsc commented 8 years ago

@mdempsky, agreed, that seems like yet another argument against _cgo_defs.txt. I think it's fine to start by hard-coding in both the compiler and go/types that to resolve C.foo you usually look for _Cconst_foo, _Cfunc_foo, or _Ctype_foo (in that order), and in the case x,y := C.foo() you instead look for _C2func_foo. That's a very tiny amount of code, OK to just write twice (once in go/types and once in the compiler). That doesn't mean changing anything about where cgo is invoked: x/tools/go/loader would still invoke cgo.
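
A sketch of that hard-coded lookup, in the order just described (the helper names are hypothetical):

package cgoresolve

// resolveCgoName returns the mangled identifier to look up for a
// reference C.name. twoResult reports whether the reference is a
// two-result call x, y := C.name(); exists reports whether a given
// mangled name is declared (e.g. in _cgo_gotypes.go).
func resolveCgoName(name string, twoResult bool, exists func(string) bool) (string, bool) {
    prefixes := []string{"_Cconst_", "_Cfunc_", "_Ctype_"}
    if twoResult {
        prefixes = []string{"_C2func_"}
    }
    for _, p := range prefixes {
        if mangled := p + name; exists(mangled) {
            return mangled, true
        }
    }
    return "", false
}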

Today, given a package with x.go importing "C" and y.go not importing "C", x/tools/go/loader runs cgo on x.go and then passes x.cgo1.go, y.go, and _cgo_gotypes.go to go/types. That would change so that x/tools/go/loader would run cgo on x.go and pass x.go, y.go, and _cgo_gotypes.go to go/types. That is, the only change in the loader is to stop converting the name x.go to x.cgo1.go when preparing the list of files for go/types.

@ianlancetaylor, yes it seems like we'd need another //go:cgo_has_pointers or something for the type definition of a union or maybe anything including a union. Whatever is needed for the pointer check is fine. :-)

rsc commented 8 years ago

@mdempsky, another possibility is to adjust cgo's name mangling algorithm so that C.foo always means _Ccgo_foo or _Ccgo2_foo (for the two-result function call). You would probably have to use the list of names xxx that appear in C.xxx and use the Ccgo prefix for those names (only), with the old prefixes for all the other names that appear just to explain the originals. For example, this program works today:

package main

/*
typedef struct { int z; } X;
typedef struct { X *x; } Y;

#define X 10
Y* f(int x) { return 0; }
*/
import "C"

func main() {
    println(C.f(C.X))
}

because _Cconst_X (needed as C.X) and _Ctype_X (needed to explain C.f but not itself available as a C.name) are different identifiers. That would need to keep working. In this case _Cconst_X would become _Ccgo_X but _Ctype_X would stay _Ctype_X.

alandonovan commented 8 years ago

@mdempsky Perhaps this falls into the second of your three categories, but you could make go/types call through an interface up into the application whenever it sees an import "C" or a selection C.x (with sufficient context information to deal with function overloading, of course), and provide an external package that satisfies this interface by running cgo and parsing cgodefs.txt. Think of it as a special kind of types.Importer.

This would entail only a small number of changes to go/types and would avoid go/types depending on the implementation of cgo.
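
One possible shape for such an interface, purely as a sketch (the name and signature are assumptions, not a proposed API):

package cgotypes

import "go/types"

// CResolver would be consulted by the type checker whenever it sees
// import "C" or a selection C.x in a cgo file. twoResult distinguishes
// the x, err := C.f() form. An implementation could run cmd/cgo and
// consult its output (e.g. a _cgo_defs.txt-style file).
type CResolver interface {
    ResolveC(pkg *types.Package, name string, twoResult bool) (types.Object, error)
}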

mdempsky commented 8 years ago

I think it's fine to start by hard-coding in both the compiler and go/types that to resolve C.foo you usually look for _Cconst_foo, _Cfunc_foo, or _Ctype_foo (in that order), and in the case x,y := C.foo() you instead look for _C2func_foo.

Okay, I think I'm reasonably sold on this approach (i.e., making cmd/compile and go/types aware of cmd/cgo's C.foo => _Cbar_foo mangling convention, and having them process the original Go source files + _cgo_gotypes.go) as the least-intrusive solution to the issues I pointed out.

another possibility is to adjust cgo's name mangling algorithm so that C.foo always means _Ccgo_foo or _Ccgo2_foo (for the two-result function call).

I was considering that too. Seems desirable to me to simplify compiler integration, but not critical.

It'll change runtime names though (e.g., stack traces and reflect type info). Is that a problem? cmd/cgo's documentation declares its mangling convention (albeit not under Go 1 compat), and some misc/cgo tests assume it too.

mdempsky commented 8 years ago

you could make go/types call through an interface up into the application whenever it sees an import "C" or a selection C.x (with sufficient context information to deal with function overloading, of course), and provide an external package that satisfies this interface by running cgo and parsing cgodefs.txt. Think of it as a special kind of types.Importer.

I'm not opposed to an extensible interface per se, but I'm worried about having to design and commit to one. Your comparison to types.Importer seems particularly worrying since I feel like that's been one of our more limiting interfaces as tools and use cases evolve.

alandonovan commented 8 years ago

@mdempsky That's rather pessimistic. types.Importer had to change because of a new feature (vendoring) in the build system. In general, new features in the language and the package system will require API changes to the go/* packages, possibly incompatible changes (as we saw with aliases). The design of cgo has been relatively stable; I think it's worth at least sketching out this approach.

mdempsky commented 8 years ago

FYI, I just uploaded proof-of-concept CLs 33677 and 33678, which implement the bare minimum of @rsc's suggested changes for go/types and x/tools/go/loader. And as intended, they fix #17839.

rsc commented 7 years ago

@mdempsky, this seems like a definite win. Let's do this for Go 1.9 and figure out whether to simplify the cgo output at the same time to make the job easier. (We're committing to an API, essentially, if we want go/types from one release to work with as many others as possible.)

mdempsky commented 7 years ago

I'm going to work on cleaning up my POC CLs to be submittable. @griesemer and I just discussed the go/types public API and believe we need a small extension (the UsesCgo configuration flag referenced in the comments below):

Rationale:

gopherbot commented 7 years ago

CL https://golang.org/cl/33677 mentions this issue.

bradfitz commented 7 years ago

@mdempsky, if this isn't happening for Go 1.9, can you retarget for Go 1.10?

mdempsky commented 7 years ago

I updated CL 33677 last night. go/types adds the UsesCgo flag mentioned above and srcimporter uses it instead of FakeImportC when build.Context.OpenFile == nil. (cmd/cgo, gcc, etc. don't support using OpenFile; I tried copying source files to a temporary directory, but gcc can potentially #include arbitrary files from the source package directory.)

As a test case, it's able to successfully type check misc/cgo/test. (Also notably, but less impressively, it type checks the standard library with cgo enabled.)

One nit is that most of the time C.bar is rewritten to some _Cfoo_bar identifier (sometimes with extra fluff), but a variable reference is rewritten to *_Cvar_bar, where _Cvar_bar is a pointer variable initialized like var _Cvar_bar = (*T)(unsafe.Pointer(&__cgo_xxx)), and __cgo_xxx has type byte. So at least without changing cmd/cgo, there's no object that C.bar can resolve to with the same type as the expression C.bar itself needs to have.

For the CL, I bind C.bar to _Cvar_bar and add an implicit pointer dereference.
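
Schematically, the shape being described (identifiers follow the description above; the real generated code differs in detail):

package cgoshape

import "unsafe"

type _Ctype_int int32

// What cgo emits for a C variable `int bar`: a byte-typed anchor and a
// typed pointer to it.
var __cgo_bar byte
var _Cvar_bar = (*_Ctype_int)(unsafe.Pointer(&__cgo_bar))

// A use of C.bar in the original source therefore corresponds to the
// expression *_Cvar_bar, hence the implicit dereference added in the CL.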

I also discovered there are quite a few more _Cfoo_ prefixes than initially discussed. Thankfully cmd/cgo is already strict about not allowing C.bar to resolve to different C definition kinds in different Go source files, so the order they're tested in doesn't matter.

mewmew commented 7 years ago

Thankfully cmd/cgo is already strict about not allowing C.bar to resolve to different C definition kinds in different Go source files, so the order they're tested in doesn't matter.

If I understand correctly, the intention is to potentially revisit that restriction (see #13467).

From @rsc: https://github.com/golang/go/issues/13467#issuecomment-265161883

@joegrasse, probably not, but if we do #16623 (let compiler know more about cgo) then the compiler would be in a position to resolve this, if we wanted to.

mewmew commented 7 years ago

Oh, I probably spoke too soon. #13467 is about allowing C.bar to refer to the same C type across different Go packages, so it is not at odds with the restriction you pointed out.

gopherbot commented 7 years ago

Change https://golang.org/cl/77153 mentions this issue: cmd/cgo: modify source as text, not as AST

joegrasse commented 6 years ago

@mdempsky Just curious if you think this will make it into Go1.11?

mdempsky commented 6 years ago

@joegrasse Sorry, no. CL 33677 is still unsubmitted.

joegrasse commented 6 years ago

@mdempsky Is there a plan to submit it for Go1.12?

gopherbot commented 4 years ago

Change https://golang.org/cl/231459 mentions this issue: go/types: add UsesCgo config to support _cgo_gotypes.go

gopherbot commented 4 years ago

Change https://golang.org/cl/234526 mentions this issue: go/types: replace Config.UsesCgo with Checker.CgoFiles

gopherbot commented 4 years ago

Change https://golang.org/cl/237417 mentions this issue: go/types: rename UsesCgo to go115UsesCgo

gopherbot commented 4 years ago

Change https://golang.org/cl/237423 mentions this issue: go/packages: move TypecheckCgo to packagesinternal

gopherbot commented 4 years ago

Change https://golang.org/cl/237422 mentions this issue: internal/typesinternal: remove fallback check for UsesCgo

lotodore commented 2 years ago

Sorry for digging up this old issue... I just wanted to note that its priority may be underestimated.

Google is developing Carbon largely because of interoperability with an existing language, which confirms how important this is. At the same time, Go's C bindings are, after many years, still rather painful to use, partly due to https://github.com/golang/go/issues/975, which has been open for years and is still waiting on this very issue to be implemented.

Migrating existing applications to Go could really use more support, so it might be worth reconsidering the priority of this issue.