Open zhouguangyuan0718 opened 2 years ago
Change https://go.dev/cl/398735 mentions this issue: cmd/compile,link: generating dwarf type info in compiler
There is no user visible change here (except that debug info is better), so this doesn't have to go through the proposal process. The linker/compiler/runtime team can decide what to do here.
CC @golang/runtime
I will send a CL tree to implement it formally.
Change https://go.dev/cl/399059 mentions this issue: cmd/link,compile: refacte dwarf generation of linker
Change https://go.dev/cl/399294 mentions this issue: cmd/internal/dwarf: define interface for dwarf type info
Change https://go.dev/cl/399062 mentions this issue: cmd/link: extract newtype1 function
Change https://go.dev/cl/399295 mentions this issue: cmd/link: complate decoupling for newtype
Change https://go.dev/cl/399063 mentions this issue: cmd/link,cmd/internal/dwarf: move newtype to dwarf package
Change https://go.dev/cl/399064 mentions this issue: cmd/compile: implement dwarf.Type interface for types.Type
Change https://go.dev/cl/399296 mentions this issue: cmd/internal/dwarf: clarify symbol name and dwarfname
Change https://go.dev/cl/399297 mentions this issue: cmd/internal/dwarf: clarify Type and Sym
Change https://go.dev/cl/399298 mentions this issue: cmd/link,obj/dwarf: unify DefGoType and DefPtrTo
Change https://go.dev/cl/399301 mentions this issue: cmd/internal/obj/dwarf: implement some base method for dwCtxt
Change https://go.dev/cl/399302 mentions this issue: cmd/compile: generate dwarf type info as aux sym of type
@zhouguangyuan0718 thanks for writing this up and sending a first set of patches. Let me know when they are ready for review.
Moving type generation from linker to compiler is definitely a worthy project. If I were doing this work myself, there are some other cleanups I would also consider folding in as well. Specifically:
- The way the linker currently does type DIE generation is more "C-like" than it should be (basically because the code was originally written in C and then converted/translated to Go). The DWDie/DWAttr setup is also more memory-intensive than it needs to be (these structs also have a lot of pointers, making extra work for the GC).
- It is also confusing to have two entirely separate ways of creating DWARF DIEs in the toolchain (e.g. the code that emits subprogram and related DIEs has one way of creating the DWARF, and the linker code has an entirely different way via DWDie/DWAttr). Ideally we would want to unify these two.
With that said, these cleanups are independent and can be tackled separately at some future time.
Change https://go.dev/cl/399275 mentions this issue: cmd/link: use the dwarf type info generated by compiler
@thanm thanks for your reply. I'm glad to know it is a worthy project. It now basically works with CL 399275 and its relation chain (with some bugs I'm still investigating). At this point the DWARF type info is generated in both the compiler and the linker: the compiler generates the types defined in the current compile unit, except the synthesized types, and the linker generates everything else. I will move the remainder step by step. To make sure I'm not on the wrong track, could you take a quick look at the current patchsets? They begin with CL 399059. Thanks very much.
I want to do some cleanup, too. I didn't add duplicate code to the compiler; I moved some code from the linker to cmd/internal/dwarf so that both the compiler and the linker can use it. That should make it easier to unify the two in the future.
To make sure I'm not on the wrong track, could you take a quick look at the current patchsets? They begin with CL 399059. Thanks very much.
I took a very quick skim. In general looks to be moving in the right direction.
I am not sure about https://go-review.googlesource.com/c/go/+/399302 however-- what is the purpose of making DWARF type die symbols into aux symbols attached to type symbols? If the type DIE symbol is an aux, it means you can't look it up by name. This seems to me that it will force a lot of rewriting/post-processing in the linker.
it means you can't look it up by name.
Sorry, maybe I used an incorrect combination of aux and pkgDef for the DWARF symbol? It seems I can still look it up by name now, and I can also reach it as an aux of a type symbol.
what is the purpose of making DWARF type die symbols into aux symbols attached to type symbols?
Based on the understanding I described above, I hope that all the DWARF types can be collected at the end by taking the aux symbols of the reachable Go types and traversing their relocations. That may be faster than name lookup.
If I missed something that prevents this approach, I will put them in data.
Change https://go.dev/cl/399877 mentions this issue: cmd/compile: add a generator for synthesize dwarf type
It doesn't really make sense to have a symbol that is both aux and pkgdef -- the primary reason we make the other DWARF sub-symbols (ex: DWARFLINES, DWARFFCN) is that we never have to look them up by name. Better to have the DWARF type DIE symbols just be regular named symbols.
Thank you for the explanation. Handling their relocations correctly may be a little difficult; I ran into many relocation problems at the beginning. In particular, some DWARF type symbols are nameless and duplicated. Still, I will keep trying to make them regular symbols.
The other thing I'm not sure about is CL 399877. Should I do it like this? Or should I put them into cmd/compile/internal/typecheck/builtin, then fix mkbuiltin.go and handle them in a common way? Could you give me some advice?
The other thing I'm not sure about is CL 399877. Should I do it like this? Or should I put them into cmd/compile/internal/typecheck/builtin, then fix mkbuiltin.go and handle them in a common way? Could you give me some advice?
I'm not familiar with the DWARF or linker complexities here, but strictly from a cmd/compile point-of-view:
reflectdata.WriteBasicTypes
.)package internal
.)
- My first preference would be that the DWARF type descriptions are simply generated while compiling package runtime,
The DWARF type symbol can't be generated directly; a prototype is needed. As the linker does now in src/cmd/link/internal/ld/dwarf.go:^synthesizeXXXtypes. A basic DWARF type for map, slice, chan and string is generated in newtype. Then we need to synthesize the final DWARF info from those prototypes. Some of the implementation may be simplified in the future, but it seems we can't do this without the prototypes.
- If that's not doable, then extending mkbuiltin.go so that we have a single source of mock compiler definitions,
Some work may be needed on mkbuiltin.go before I can add them to builtin: mkbuiltin.go can't support complex struct types. I will try to fix that later. For now I will keep it as the CL does, so that I can continue with the synthesis.
As the linker does now in src/cmd/link/internal/ld/dwarf.go:^synthesizeXXXtypes.
It looks like that code is accessing the Go runtime type descriptor's for the actual declared types within package runtime?
If so, I don't see a need for those types to be added to cmd/compile/internal/typecheck/builtin/runtime.go or to any similar files. It should be sufficient that they're declared within runtime/*.go already.
It looks like that code is accessing the Go runtime type descriptor's for the actual declared types within package runtime?
AFAIK, it seems like the code is accessing the DWARF type DIEs of some Go runtime types.
If so, I don't see a need for those types to be added to cmd/compile/internal/typecheck/builtin/runtime.go or to any similar files. It should be sufficient that they're declared within runtime/*.go already.
The linker can access any symbol it needs in all imported packages and the base package. If I move synthesizeXXXtypes to the compiler, I need a mock type which can describe these types. Slice and string are simple enough that I can hard-code them and keep a relocation to the runtime type. But map and hchan are, AFAIK, like templates in C++: I need to fill them in when I generate these DWARF types in the compiler. synthesizemaptypes and synthesizechantypes may show why this is necessary. So I think some mock types are needed for synthesizetypes in the compiler. I'm sorry if I misunderstand.
Another way to avoid this may be to keep the synthesis in the linker: the compiler only generates the initial DWARF type for map and chan, and the linker then decodes them into struct DIEs and synthesizes. I'm worried about the cost of doing it this way. And for dynamic linking, we can't do it this way...
It doesn't really make sense to have a symbol that is both aux and pkgdef -- the primary reason we make the other DWARF sub-symbols (ex: DWARFLINES, DWARFFCN) is that we never have to look them up by name. Better to have the DWARF type DIE symbols just be regular named symbols.
Thank you very much. It works after I made them regular symbols, and it is easier than using them as aux symbols.
Change https://go.dev/cl/399878 mentions this issue: cmd/link: refactor mkinternaltype and cleanup DefPtrTo
Change https://go.dev/cl/399880 mentions this issue: cmd/link,internal/dwarf: move synthesizeXXXtypes to internal/dwarf
Change https://go.dev/cl/399879 mentions this issue: cmd/link: decouple synthesizeXXXtypes with specific code in linker
Change https://go.dev/cl/400136 mentions this issue: cmd/compile: support define a pointer die with a die sym input
Change https://go.dev/cl/400137 mentions this issue: cmd/compile: synthesize types in compiler
Change https://go.dev/cl/400135 mentions this issue: cmd/link: unify the key of prototypedies in linker and compiler
I ran into a problem that is tricky for me. This testcase can't pass, because some dwarf types are defined in more than one compileunit. Should I skip SDWARFTYPE symbols in this check?
# go run run.go -- fixedbugs/issue30908.go
exit status 1
cmd/link: while reading object for 'os': duplicate symbol 'go.info.*[]string', previous def at 'main', with mismatched payload: same length but different contents
cmd/link: while reading object for 'os': duplicate symbol 'go.info.*interface {}', previous def at 'main', with mismatched payload: same length but different contents
cmd/link: while reading object for 'io/fs': duplicate symbol 'go.info.*interface {}', previous def at 'main', with mismatched payload: same length but different contents
cmd/link: while reading object for 'io/fs': duplicate symbol 'go.info.*io/fs.DirEntry', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'io/fs': duplicate symbol 'go.info.*[]string', previous def at 'main', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*error', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*int64', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*uintptr', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*int32', previous def at 'io', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*uint8', previous def at 'main', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*uint16', previous def at 'io', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*uint64', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime': duplicate symbol 'go.info.*float64', previous def at 'sort', with mismatched payload: same length but different contents
cmd/link: while reading object for 'sync': duplicate symbol 'go.info.*uint32', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'syscall': duplicate symbol 'go.info.*unsafe.Pointer', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'syscall': duplicate symbol 'go.info.*syscall.Timespec', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'syscall': duplicate symbol 'go.info.*uint32', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/poll': duplicate symbol 'go.info.*syscall.Sockaddr', previous def at 'syscall', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/poll': duplicate symbol 'go.info.*syscall.Iovec', previous def at 'syscall', with mismatched payload: same length but different contents
cmd/link: while reading object for 'time': duplicate symbol 'go.info.*time.Duration', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'time': duplicate symbol 'go.info.*time.Time', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'time': duplicate symbol 'go.info.*[]uint8', previous def at 'main', with mismatched payload: same length but different contents
cmd/link: while reading object for 'time': duplicate symbol 'go.info.*uint32', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'io/ioutil': duplicate symbol 'go.info.*[]io/fs.FileInfo', previous def at 'os', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/reflectlite': duplicate symbol 'go.info.**uint8', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/reflectlite': duplicate symbol 'go.info.*internal/reflectlite.Value', previous def at 'sort', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/reflectlite': duplicate symbol 'go.info.*unsafe.Pointer', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/reflectlite': duplicate symbol 'go.info.*int16', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/cpu': duplicate symbol 'go.info.*uint32', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime/internal/atomic': duplicate symbol 'go.info.*uint32', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'runtime/internal/atomic': duplicate symbol 'go.info.*unsafe.Pointer', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'internal/abi': duplicate symbol 'go.info.*unsafe.Pointer', previous def at 'runtime', with mismatched payload: same length but different contents
cmd/link: while reading object for 'strings': duplicate symbol 'go.info.*uint32', previous def at 'runtime', with mismatched payload: same length but different contents
FAIL fixedbugs/issue30908.go 1.179s
This testcase can't pass, because some dwarf types are defined in more than one compileunit
This looks like an actual bug to me. We expect content to match for dupok symbols (if not, it's nearly always an indication of a real problem).
DWARF type die symbols should follow the same rules that type descriptor symbols do, with respect to where they are defined and whether they are emitted as "dupok".
For named types, what we want is for the compiler to emit a type DIE symbol definition only for things defined in the package. These should be regular symbols, not dupok symbols (since by construction each package has a distinct package path).
For non-named types, like the "[13]complex64" below, when compiling package "foo" we have to allow for the possibility of some other package also having a variable of "[13]complex64" type, so the strategy would be to emit an appropriately named DWARF type DIE symbol but make it dupok.
package foo

var X [13]complex64
In addition:
For type descriptor symbols, the compiler has a hard-coded set of very basic, common types that it only emits definitions for when building the runtime. So for example when you compile some other package (not the runtime) and the compiler needs to refer to "type.uint8", it just emits a reference and assumes that the runtime will have a definition of that type descriptor symbol. DWARF generation should work the same way-- when compiling the "syscall" package (for example), the compiler should not be emitting a DWARF type DIE symbol like "go.info.*uint32", since that DIE will be defined by the runtime.
See https://go.googlesource.com/go/+/b55a2fb3b0d67b346bac871737b862f16e5a6447/src/cmd/compile/internal/reflectdata/reflect.go#1394 (related).
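The three rules above (named types, non-named types, and the runtime's hard-coded basic set) can be summarized as a small decision function. This is only a sketch of the policy as described in this comment, not compiler code: `typeInfo`, `emit` and `policy` are hypothetical names, and treating the runtime's own definitions as regular (rather than dupok) symbols is an assumption of the sketch.

```go
package main

import "fmt"

// emit says what the compiler does with a type's DWARF DIE symbol.
type emit int

const (
	refOnly    emit = iota // emit a reference; another package defines the DIE
	defRegular             // define a regular (non-dupok) symbol
	defDupok               // define the symbol and mark it dupok
)

func (e emit) String() string {
	return [...]string{"reference only", "regular definition", "dupok definition"}[e]
}

// typeInfo is a hypothetical, simplified description of a Go type.
type typeInfo struct {
	name         string // e.g. "*uint32", "foo.T", "[13]complex64"
	pkg          string // defining package path; empty for non-named types
	named        bool
	runtimeOwned bool // in the hard-coded set that only package runtime defines
}

// policy returns the emission decision for t while compiling package compiling.
func policy(t typeInfo, compiling string) emit {
	switch {
	case t.runtimeOwned:
		if compiling == "runtime" {
			return defRegular // assumption: the runtime's copy is a regular symbol
		}
		return refOnly
	case t.named:
		if t.pkg == compiling {
			return defRegular
		}
		return refOnly
	default:
		return defDupok // other packages may emit the same non-named type
	}
}

func main() {
	types := []typeInfo{
		{name: "*uint32", runtimeOwned: true},
		{name: "foo.T", pkg: "foo", named: true},
		{name: "[13]complex64"},
	}
	for _, t := range types {
		fmt.Printf("%-14s compiling foo: %v\n", t.name, policy(t, "foo"))
	}
}
```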
Let me know if this makes sense. These sorts of issues are normally hashed out during a design review or as part of discussion/comment on design doc, but that sort of got sidestepped in this case.
@thanm, I'm not sure I followed your explanation of why the DWARF type DIEs shouldn't be aux symbols. To me it seems pretty natural that, since the type descriptor symbols already have the naming and dupok properties we need, that we should just hang the DWARF equivalent off the type descriptor symbol. You mentioned that this would require a lot of post-processing in the linker; what sort of post-processing were you concerned about?
Consider this package:
package bar

type X [99]uint64

func A(p1 X) uint64 {
	return p1[0]
}
Compiler generates this as a DWARF subprogram DIE for "A":
aux for "".A<0> SDWARFFCN size=55
0x0000 03 62 61 64 2e 41 00 00 00 00 00 00 00 00 00 00 .bad.A..........
0x0010 00 00 00 00 00 00 00 01 9c 02 00 00 00 01 11 70 ...............p
0x0020 31 00 00 05 00 00 00 00 01 9c 11 7e 72 30 00 01 1..........~r0..
0x0030 05 00 00 00 00 00 00 .......
rel 0+0 t=22 type.uint64<0>+0
rel 7+8 t=1 "".A<1>+0
rel 15+8 t=1 "".A<1>+6
rel 36+4 t=31 go.info."".X<0>+0
rel 49+4 t=31 go.info.uint64<0>+0
In the initial set of patches sent by zhouguangyuan0718@, type DIEs were being emitted as aux symbols attached to type descriptor symbols. However, subprogram DIE generation was unchanged (param types were still references to "go.info.<type>"). In the code above you can see relocs against go.info."".X and go.info.uint64. If the type DIE for the type "X" is an aux symbol (and not a named pkgdef symbol) then we would have to have some sort of fixup phase in the linker that would take the go.info."".X reference and rewrite it to the aux symbol. Or something like that.
Let me know if this answers your question.
Thanks for the explanation. Clearly we wouldn't be able to reference the "go.info.X" symbols like that if the type DIEs are in aux symbols, but that seems like more a problem with the references than with the idea of using aux symbols. Wouldn't it work to have a new relocation type that's like R_DWARFSECREF, but that references the Go type descriptor and resolves to the DWARF offset of its type DIE? Or is that just more complicated than having a parallel set of named symbols for the type DIEs? (In general I'm just wary of anything that depends on name lookups in the linker.)
Wouldn't it work to have a new relocation type that's like R_DWARFSECREF, but that references the Go type descriptor and resolves to the DWARF offset of its type DIE?
Cool idea, yeah, that seems like it could work. Of course all of those relocations would have to be treated specially by the linker (since it's a sort of "wink wink nudge nudge, I'm pointing to X but really to Y"), but it sounds doable.
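To make the proposed relocation concrete, here is a minimal sketch of how such a reference might be resolved. Nothing in it is an existing linker API: `reloc`, `dieOffsets` and `apply` are hypothetical, and the 4-byte little-endian patch is an assumption about the target encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// reloc is a hypothetical relocation that names a Go type descriptor
// symbol but should be resolved to the offset of that type's DWARF DIE.
type reloc struct {
	off    int    // byte offset in the referring DIE payload to patch
	target string // type descriptor symbol, e.g. "type.uint64"
}

// dieOffsets maps a type descriptor symbol to the .debug_info offset of
// the type DIE attached to it (for example as an aux symbol).
type dieOffsets map[string]uint32

// apply patches payload at r.off with the DIE offset for r.target.
// The reference is "to X but really to Y": the reloc names the type
// descriptor, and the value written is the DIE's offset.
func apply(payload []byte, r reloc, offs dieOffsets) error {
	off, ok := offs[r.target]
	if !ok {
		return fmt.Errorf("no type DIE known for %s", r.target)
	}
	if r.off < 0 || r.off+4 > len(payload) {
		return fmt.Errorf("reloc at %d does not fit in payload of %d bytes", r.off, len(payload))
	}
	binary.LittleEndian.PutUint32(payload[r.off:], off)
	return nil
}

func main() {
	payload := make([]byte, 8)
	offs := dieOffsets{"type.uint64": 0x1234} // made-up offset for illustration
	fmt.Println(apply(payload, reloc{off: 4, target: "type.uint64"}, offs), payload)
	fmt.Println(apply(payload, reloc{off: 4, target: "type.missing"}, offs))
}
```

The failing second call corresponds to the case where the linker has no type descriptor (and so no attached DIE) for the target, which is where a scheme like this would need a fallback.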
@thanm Thanks for your advice. I had not considered the dupok property of type descriptors before. I will try it.
About aux symbols: at the beginning, I hoped that the type DIE sym could be found both through the type descriptor and by name. But it seems that is not the normal way. So I made them regular named symbols and access them by name and reloc. It works well except for the bug above.
Wouldn't it work to have a new relocation type that's like R_DWARFSECREF, but that references the Go type descriptor and resolves to the DWARF offset of its type DIE?
Cool idea, yeah, that seems like it could work. Of course all of those relocations would have to be treated specially by the linker (since it's a sort of "wink wink nudge nudge, I'm pointing to X but really to Y"), but it sounds doable.
It's an interesting approach. What I worry about is dynamic linking: when linking dynamically, the linker does not have all the type descriptors. I know dynamic linking is not a common case, but my original purpose in doing this is to debug dynlink libraries and executables. So I hope to find a balanced solution.
Update: Maybe we can use different reloc types for static and dynamic linking. I will try that as well.
This looks like an actual bug to me. We expect content to match for dupok symbols (if not, it's nearly always an indication of a real problem).
I misunderstood what this test case meant before. I found that the main reason for the failure is that the pointer DIE sym defined by DefPtrTo was missing the DW_AT_go_kind attribute. It was already missing in the linker before my patchset, at https://github.com/golang/go/blob/d2552037426fe5a190c74172562d897d921fe311/src/cmd/link/internal/ld/dwarf.go#L733. That has no effect in the linker, but in the compiler it failed because of the duplicate generation. I fixed it in the compiler in my patchset, and it seems to work well now. I will send a separate CL later to fix it in the linker on the master branch.
@thanm Thanks for your advice. I had not considered the dupok property of type descriptors before. I will try it.
Besides the above, the patchset follows your advice about the dupok property. The DWARF type DIE inherits the dupok property from the corresponding type descriptor. I also mark the pointers defined by DefPtrTo and the internal types of hchan and hmap as dupok, because I think they are unnamed, too. Did I misunderstand?
Let me know if this makes sense.
@thanm Yes, it cleared up my confusion about the tricky symbols, and it works well now.
The newest patchset as I write this comment is patchset 6 of CL 400137. It contains what I describe in this comment.
In the next steps, besides what I listed in the description of this issue, I'm planning to do some functional cleanups and refactoring.
But before I continue with that and mark things ready for review, I want to make sure the current patchset is correct and works on more builders. I can only run all.bash on my Windows machine and in WSL on it. Could you help me run the trybots on the newest patchset? It would be better if I could get try-bot access via issue #52249.
Change https://go.dev/cl/400138 mentions this issue: cmd/link: remove the dwarf type info generation code in linker
Change https://go.dev/cl/400139 mentions this issue: cmd/compile,internal/dwarf: cleanup some code
@thanm, all the CLs I have sent now work. I will continue with the remaining items for this issue and will not change these CLs much. If there is any non-functional refactoring I need to do, I hope I can do it in the cleanup CL at the end of the chain.
So, could you start reviewing them? Or do you need me to squash the CLs? There is some temporary code mixed in, which is a little confusing; it is removed again in the top CLs. But if I keep the chain as it is, you can follow the details in the CLs more clearly.
I'll try to free up some time to review in the next couple of days (my queue is a bit full at the moment).
@mdempsky Sorry, maybe my explanation in https://github.com/golang/go/issues/52209#issuecomment-1097475287 is not detailed enough. Consider this code:
package main

import "fmt"

func main() {
	m := make(map[int]int)
	m[1] = 1
	fmt.Println(m)
}
I built it with go1.17 and then used readelf -wi to decode the DWARF info. The DWARF type info for map[int]int contains more than what we can see at the language level. I have excerpted the key information:
<1><69112>: Abbrev Number: 32 (DW_TAG_typedef)
    <69113> DW_AT_name : map[int]int
    <6911f> DW_AT_type : <0x6ca7f>
    <69123> Unknown AT value: 2900: 21
    <69124> Unknown AT value: 2904: 0x9680
    <6912c> Unknown AT value: 2901: <0x5800c>
    <69130> Unknown AT value: 2902: <0x5800c>
In the DWARF info, the type map[int]int is a typedef; the actual type is *hash<int,int>:
<1><6ca7f>: Abbrev Number: 33 (DW_TAG_pointer_type)
    <6ca80> DW_AT_name : *hash<int,int>
    <6ca8f> DW_AT_type : <0x6c9e5>
    <6ca93> Unknown AT value: 2900: 0
    <6ca94> Unknown AT value: 2904: 0x0
*hash<int,int> is a pointer to hash<int,int>.
<1><6c9e5>: Abbrev Number: 37 (DW_TAG_structure_type)
    <6c9e6> DW_AT_name : hash<int,int>
    <6c9f4> DW_AT_byte_size : 48
    <6c9f5> Unknown AT value: 2900: 0
    <6c9f6> Unknown AT value: 2904: 0x0
 <2><6c9fe>: Abbrev Number: 22 (DW_TAG_member)
    <6c9ff> DW_AT_name : count
    <6ca05> DW_AT_data_member_location: 0
    <6ca06> DW_AT_type : <0x5800c>
    <6ca0a> Unknown AT value: 2903: 0
 <2><6ca0b>: Abbrev Number: 22 (DW_TAG_member)
    <6ca0c> DW_AT_name : flags
    <6ca12> DW_AT_data_member_location: 8
    <6ca13> DW_AT_type : <0x5790d>
    <6ca17> Unknown AT value: 2903: 0
 <2><6ca18>: Abbrev Number: 22 (DW_TAG_member)
    <6ca19> DW_AT_name : B
    <6ca1b> DW_AT_data_member_location: 9
    <6ca1c> DW_AT_type : <0x5790d>
    <6ca20> Unknown AT value: 2903: 0
 <2><6ca21>: Abbrev Number: 22 (DW_TAG_member)
    <6ca22> DW_AT_name : noverflow
    <6ca2c> DW_AT_data_member_location: 10
    <6ca2d> DW_AT_type : <0x57b38>
    <6ca31> Unknown AT value: 2903: 0
 <2><6ca32>: Abbrev Number: 22 (DW_TAG_member)
    <6ca33> DW_AT_name : hash0
    <6ca39> DW_AT_data_member_location: 12
    <6ca3a> DW_AT_type : <0x578e0>
    <6ca3e> Unknown AT value: 2903: 0
 <2><6ca3f>: Abbrev Number: 22 (DW_TAG_member)
    <6ca40> DW_AT_name : buckets
    <6ca48> DW_AT_data_member_location: 16
    <6ca49> DW_AT_type : <0x6c9c6>
    <6ca4d> Unknown AT value: 2903: 0
 <2><6ca4e>: Abbrev Number: 22 (DW_TAG_member)
    <6ca4f> DW_AT_name : oldbuckets
    <6ca5a> DW_AT_data_member_location: 24
    <6ca5b> DW_AT_type : <0x6c9c6>
    <6ca5f> Unknown AT value: 2903: 0
 <2><6ca60>: Abbrev Number: 22 (DW_TAG_member)
    <6ca61> DW_AT_name : nevacuate
    <6ca6b> DW_AT_data_member_location: 32
    <6ca6c> DW_AT_type : <0x57809>
    <6ca70> Unknown AT value: 2903: 0
 <2><6ca71>: Abbrev Number: 22 (DW_TAG_member)
    <6ca72> DW_AT_name : extra
    <6ca78> DW_AT_data_member_location: 40
    <6ca79> DW_AT_type : <0x5e068>
    <6ca7d> Unknown AT value: 2903: 0
 <2><6ca7e>: Abbrev Number: 0
hash<int,int> is a struct; it is actually runtime.hmap. It references the type *bucket<int,int> through the fields buckets and oldbuckets.
<1><6c9c6>: Abbrev Number: 33 (DW_TAG_pointer_type)
    <6c9c7> DW_AT_name : *bucket<int,int>
    <6c9d8> DW_AT_type : <0x6c96f>
    <6c9dc> Unknown AT value: 2900: 0
    <6c9dd> Unknown AT value: 2904: 0x0
*bucket<int,int> is a pointer to bucket<int,int>.
<1><6c96f>: Abbrev Number: 37 (DW_TAG_structure_type)
    <6c970> DW_AT_name : bucket<int,int>
    <6c980> DW_AT_byte_size : 144
    <6c982> Unknown AT value: 2900: 0
    <6c983> Unknown AT value: 2904: 0x0
 <2><6c98b>: Abbrev Number: 22 (DW_TAG_member)
    <6c98c> DW_AT_name : tophash
    <6c994> DW_AT_data_member_location: 0
    <6c995> DW_AT_type : <0x5bfb2>
    <6c999> Unknown AT value: 2903: 0
 <2><6c99a>: Abbrev Number: 22 (DW_TAG_member)
    <6c99b> DW_AT_name : keys
    <6c9a0> DW_AT_data_member_location: 8
    <6c9a1> DW_AT_type : <0x6c92d>
    <6c9a5> Unknown AT value: 2903: 0
 <2><6c9a6>: Abbrev Number: 22 (DW_TAG_member)
    <6c9a7> DW_AT_name : values
    <6c9ae> DW_AT_data_member_location: 72
    <6c9af> DW_AT_type : <0x6c94e>
    <6c9b3> Unknown AT value: 2903: 0
 <2><6c9b4>: Abbrev Number: 22 (DW_TAG_member)
    <6c9b5> DW_AT_name : overflow
    <6c9be> DW_AT_data_member_location: 136
    <6c9c0> DW_AT_type : <0x6c9c6>
    <6c9c4> Unknown AT value: 2903: 0
 <2><6c9c5>: Abbrev Number: 0
bucket<int,int> is a struct; it is actually runtime.bmap. It references []key<int> and []val<int> through the fields keys and values.
<1><6c92d>: Abbrev Number: 28 (DW_TAG_array_type)
    <6c92e> DW_AT_name : []key<int>
    <6c939> DW_AT_type : <0x5800c>
    <6c93d> DW_AT_byte_size : 64
    <6c93e> Unknown AT value: 2900: 0
    <6c93f> Unknown AT value: 2904: 0x0
 <2><6c947>: Abbrev Number: 25 (DW_TAG_subrange_type)
    <6c948> DW_AT_type : <0x57809>
    <6c94c> DW_AT_count : 8
 <2><6c94d>: Abbrev Number: 0
<1><6c94e>: Abbrev Number: 28 (DW_TAG_array_type)
    <6c94f> DW_AT_name : []val<int>
    <6c95a> DW_AT_type : <0x5800c>
    <6c95e> DW_AT_byte_size : 64
    <6c95f> Unknown AT value: 2900: 0
    <6c960> Unknown AT value: 2904: 0x0
 <2><6c968>: Abbrev Number: 25 (DW_TAG_subrange_type)
    <6c969> DW_AT_type : <0x57809>
    <6c96d> DW_AT_count : 8
 <2><6c96e>: Abbrev Number: 0
[]key<int> and []val<int> are arrays of int.
<1><5800c>: Abbrev Number: 27 (DW_TAG_base_type)
    <5800d> DW_AT_name : int
    <58011> DW_AT_encoding : 5 (signed)
    <58012> DW_AT_byte_size : 8
    <58013> Unknown AT value: 2900: 2
    <58014> Unknown AT value: 2904: 0x68c0
We can see that runtime.hmap and runtime.bmap are like template types: every instantiated type has a different name. In particular, the sizes of bucket<T,T> (runtime.bmap), []val<T> and []key<T> are affected by the elem type.
And more important, these types are pseudo types, we can't generate them from any gotype, we can only synthesize them.
So, without prototypes of runtime.hmap and runtime.bmap, it is a little difficult to generate the whole type info in the compiler.
Without them, we need to hard-code the prototypes in the compiler's DWARF generation, do some post-processing in the linker, or keep the linker's generation as it is.
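As a rough illustration of why these types have to be synthesized per instantiation, the sketch below computes the name and field layout of bucket<K,V> from the key and value sizes. It is not the linker's synthesizemaptypes code: `synthBucket`, `synthStruct` and `field` are hypothetical, a 64-bit target is assumed, and alignment padding and indirectly stored keys or values are ignored.

```go
package main

import "fmt"

const (
	bucketCnt = 8 // entries per bucket, matching DW_AT_count 8 in the dump
	ptrSize   = 8 // assumption: 64-bit target
)

// field is one member of a synthesized struct DIE.
type field struct {
	name   string
	typ    string
	offset int
}

// synthStruct is a hypothetical, simplified synthesized struct type.
type synthStruct struct {
	name   string
	size   int
	fields []field
}

// synthBucket lays out bucket<K,V>: a tophash array, then the key array,
// then the value array, then the overflow pointer. Every offset after
// tophash depends on the key and value sizes, so each map type needs its
// own instantiation.
func synthBucket(key, val string, keySize, valSize int) synthStruct {
	keysOff := bucketCnt // tophash is [8]uint8
	valsOff := keysOff + bucketCnt*keySize
	ovfOff := valsOff + bucketCnt*valSize
	return synthStruct{
		name: fmt.Sprintf("bucket<%s,%s>", key, val),
		size: ovfOff + ptrSize,
		fields: []field{
			{"tophash", "[8]uint8", 0},
			{"keys", fmt.Sprintf("[]key<%s>", key), keysOff},
			{"values", fmt.Sprintf("[]val<%s>", val), valsOff},
			{"overflow", fmt.Sprintf("*bucket<%s,%s>", key, val), ovfOff},
		},
	}
}

func main() {
	b := synthBucket("int", "int", 8, 8)
	fmt.Println(b.name, b.size)
	for _, f := range b.fields {
		fmt.Printf("  %-8s %-16s offset %d\n", f.name, f.typ, f.offset)
	}
}
```

For int keys and values this gives a 144-byte struct with members at offsets 0, 8, 72 and 136, which agrees with the DW_AT_byte_size and DW_AT_data_member_location values in the readelf output above.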
After my patchset, the compile result of the code above is as follows (with some unrelated info removed):
go.info.*[]uint8 SDWARFTYPE dupok size=23 0x0000 23 2a 5b 5d 75 69 6e 74 38 00 00 00 00 00 16 00 #*[]uint8....... 0x0010 00 00 00 00 00 00 00 ....... rel 10+4 t=31 go.info.[]uint8+0 rel 15+8 t=-32763 type.*[]uint8+0 go.info.[]uint8 SDWARFTYPE dupok size=59 0x0000 25 5b 5d 75 69 6e 74 38 00 18 17 00 00 00 00 00 %[]uint8........ 0x0010 00 00 00 00 00 00 00 18 61 72 72 61 79 00 00 00 ........array... 0x0020 00 00 00 00 18 6c 65 6e 00 08 00 00 00 00 00 18 .....len........ 0x0030 63 61 70 00 10 00 00 00 00 00 00 cap........ rel 11+8 t=-32763 type.[]uint8+0 rel 19+4 t=31 go.info.uint8+0 rel 31+4 t=31 go.info.*uint8+0 rel 42+4 t=31 go.info.int+0 rel 53+4 t=31 go.info.int+0 go.info.*[8]uint8 SDWARFTYPE dupok size=24 0x0000 23 2a 5b 38 5d 75 69 6e 74 38 00 00 00 00 00 16 #*[8]uint8...... 0x0010 00 00 00 00 00 00 00 00 ........ rel 11+4 t=31 go.info.[8]uint8+0 rel 16+8 t=-32763 type.*[8]uint8+0 go.info.[8]uint8 SDWARFTYPE dupok size=31 0x0000 1e 5b 38 5d 75 69 6e 74 38 00 00 00 00 00 08 11 .[8]uint8....... 0x0010 00 00 00 00 00 00 00 00 1b 00 00 00 00 08 00 ............... rel 10+4 t=31 go.info.uint8+0 rel 16+8 t=-32763 type.[8]uint8+0 rel 25+4 t=31 go.info.uintptr+0 go.info.*[]int SDWARFTYPE dupok size=21 0x0000 23 2a 5b 5d 69 6e 74 00 00 00 00 00 16 00 00 00 #*[]int......... 0x0010 00 00 00 00 00 ..... rel 8+4 t=31 go.info.[]int+0 rel 13+8 t=-32763 type.*[]int+0 go.info.[]int SDWARFTYPE dupok size=57 0x0000 25 5b 5d 69 6e 74 00 18 17 00 00 00 00 00 00 00 %[]int.......... 0x0010 00 00 00 00 00 18 61 72 72 61 79 00 00 00 00 00 ......array..... 0x0020 00 00 18 6c 65 6e 00 08 00 00 00 00 00 18 63 61 ...len........ca 0x0030 70 00 10 00 00 00 00 00 00 p........ rel 9+8 t=-32763 type.[]int+0 rel 17+4 t=31 go.info.int+0 rel 29+4 t=31 go.info.*int+0 rel 40+4 t=31 go.info.int+0 rel 51+4 t=31 go.info.int+0 go.info.*[8]int SDWARFTYPE dupok size=22 0x0000 23 2a 5b 38 5d 69 6e 74 00 00 00 00 00 16 00 00 #*[8]int........ 0x0010 00 00 00 00 00 00 ...... 
rel 9+4 t=31 go.info.noalg.[8]int+0 rel 14+8 t=-32763 type.*[8]int+0 SDWARFTYPE size=35 0x0000 1e 6e 6f 61 6c 67 2e 5b 38 5d 69 6e 74 00 00 00 .noalg.[8]int... 0x0010 00 00 40 11 00 00 00 00 00 00 00 00 1b 00 00 00 ..@............. 0x0020 00 08 00 ... rel 14+4 t=31 go.info.int+0 rel 20+8 t=-32763 type.noalg.[8]int+0 rel 29+4 t=31 go.info.uintptr+0 go.info.noalg.[8]int SDWARFTYPE dupok size=18 0x0000 28 6e 6f 61 6c 67 2e 5b 38 5d 69 6e 74 00 00 00 (noalg.[8]int... 0x0010 00 00 .. rel 14+4 t=31 +0 go.info.*map.bucket[int]int SDWARFTYPE dupok size=34 0x0000 23 2a 6d 61 70 2e 62 75 63 6b 65 74 5b 69 6e 74 #*map.bucket[int 0x0010 5d 69 6e 74 00 00 00 00 00 16 00 00 00 00 00 00 ]int............ 0x0020 00 00 .. rel 21+4 t=31 go.info.noalg.map.bucket[int]int+0 rel 26+8 t=-32763 type.*map.bucket[int]int+0 SDWARFTYPE size=95 0x0000 27 6e 6f 61 6c 67 2e 6d 61 70 2e 62 75 63 6b 65 'noalg.map.bucke 0x0010 74 5b 69 6e 74 5d 69 6e 74 00 90 01 19 00 00 00 t[int]int....... 0x0020 00 00 00 00 00 18 74 6f 70 62 69 74 73 00 00 00 ......topbits... 0x0030 00 00 00 00 18 6b 65 79 73 00 08 00 00 00 00 00 .....keys....... 0x0040 18 65 6c 65 6d 73 00 48 00 00 00 00 00 18 6f 76 .elems.H......ov 0x0050 65 72 66 6c 6f 77 00 88 01 00 00 00 00 00 00 erflow......... rel 29+8 t=-32763 type.noalg.map.bucket[int]int+0 rel 47+4 t=31 go.info.[8]uint8+0 rel 59+4 t=31 go.info.noalg.[8]int+0 rel 72+4 t=31 go.info.noalg.[8]int+0 rel 89+4 t=31 go.info.uintptr+0 go.info.noalg.map.bucket[int]int SDWARFTYPE dupok size=30 0x0000 28 6e 6f 61 6c 67 2e 6d 61 70 2e 62 75 63 6b 65 (noalg.map.bucke 0x0010 74 5b 69 6e 74 5d 69 6e 74 00 00 00 00 00 t[int]int..... rel 26+4 t=31 +0 go.info.*map[int]int SDWARFTYPE dupok size=27 0x0000 23 2a 6d 61 70 5b 69 6e 74 5d 69 6e 74 00 00 00 #*map[int]int... 0x0010 00 00 16 00 00 00 00 00 00 00 00 ........... 
rel 14+4 t=31 go.info.map[int]int+0 rel 19+8 t=-32763 type.*map[int]int+0 go.info.map[int]int SDWARFTYPE dupok size=34 0x0000 22 6d 61 70 5b 69 6e 74 5d 69 6e 74 00 00 00 00 "map[int]int.... 0x0010 00 15 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0x0020 00 00 .. rel 13+4 t=31 go.info.*hash<int,int>+0 rel 18+8 t=-32763 type.map[int]int+0 rel 26+4 t=31 go.info.int+0 rel 30+4 t=31 go.info.int+0 go.info.*map.hdr[int]int SDWARFTYPE dupok size=31 0x0000 23 2a 6d 61 70 2e 68 64 72 5b 69 6e 74 5d 69 6e #*map.hdr[int]in 0x0010 74 00 00 00 00 00 16 00 00 00 00 00 00 00 00 t.............. rel 18+4 t=31 go.info.noalg.map.hdr[int]int+0 rel 23+8 t=-32763 type.*map.hdr[int]int+0 SDWARFTYPE size=162 0x0000 27 6e 6f 61 6c 67 2e 6d 61 70 2e 68 64 72 5b 69 'noalg.map.hdr[i 0x0010 6e 74 5d 69 6e 74 00 30 19 00 00 00 00 00 00 00 nt]int.0........ 0x0020 00 18 63 6f 75 6e 74 00 00 00 00 00 00 00 18 66 ..count........f 0x0030 6c 61 67 73 00 08 00 00 00 00 00 18 42 00 09 00 lags........B... 0x0040 00 00 00 00 18 6e 6f 76 65 72 66 6c 6f 77 00 0a .....noverflow.. 0x0050 00 00 00 00 00 18 68 61 73 68 30 00 0c 00 00 00 ......hash0..... 0x0060 00 00 18 62 75 63 6b 65 74 73 00 10 00 00 00 00 ...buckets...... 0x0070 00 18 6f 6c 64 62 75 63 6b 65 74 73 00 18 00 00 ..oldbuckets.... 0x0080 00 00 00 18 6e 65 76 61 63 75 61 74 65 00 20 00 ....nevacuate. . 0x0090 00 00 00 00 18 65 78 74 72 61 00 28 00 00 00 00 .....extra.(.... 0x00a0 00 00 .. rel 25+8 t=-32763 type.noalg.map.hdr[int]int+0 rel 41+4 t=31 go.info.int+0 rel 54+4 t=31 go.info.uint8+0 rel 63+4 t=31 go.info.uint8+0 rel 80+4 t=31 go.info.uint16+0 rel 93+4 t=31 go.info.uint32+0 rel 108+4 t=31 go.info.*map.bucket[int]int+0 rel 126+4 t=31 go.info.*map.bucket[int]int+0 rel 143+4 t=31 go.info.uintptr+0 rel 156+4 t=31 go.info.unsafe.Pointer+0 go.info.noalg.map.hdr[int]int SDWARFTYPE dupok size=27 0x0000 28 6e 6f 61 6c 67 2e 6d 61 70 2e 68 64 72 5b 69 (noalg.map.hdr[i 0x0010 6e 74 5d 69 6e 74 00 00 00 00 00 nt]int..... 
rel 23+4 t=31 +0 go.info.*int SDWARFTYPE dupok size=19 0x0000 23 2a 69 6e 74 00 00 00 00 00 16 00 00 00 00 00 #*int........... 0x0010 00 00 00 ... rel 6+4 t=31 go.info.int+0 rel 11+8 t=-32763 type.*int+0 go.info.*uint8 SDWARFTYPE dupok size=21 0x0000 23 2a 75 69 6e 74 38 00 00 00 00 00 16 00 00 00 #*uint8......... 0x0010 00 00 00 00 00 ..... rel 8+4 t=31 go.info.uint8+0 rel 13+8 t=-32763 type.*uint8+0 go.info.[]keySDWARFTYPE dupok size=33 0x0000 1e 5b 5d 6b 65 79 3c 69 6e 74 3e 00 00 00 00 00 .[]key ..... 0x0010 40 00 00 00 00 00 00 00 00 00 1b 00 00 00 00 08 @............... 0x0020 00 . rel 12+4 t=31 go.info.int+0 rel 27+4 t=31 go.info.uintptr+0 go.info.[]val SDWARFTYPE dupok size=33 0x0000 1e 5b 5d 76 61 6c 3c 69 6e 74 3e 00 00 00 00 00 .[]val ..... 0x0010 40 00 00 00 00 00 00 00 00 00 1b 00 00 00 00 08 @............... 0x0020 00 . rel 12+4 t=31 go.info.int+0 rel 27+4 t=31 go.info.uintptr+0 go.info.bucket<int,int> SDWARFTYPE dupok size=87 0x0000 27 62 75 63 6b 65 74 3c 69 6e 74 2c 69 6e 74 3e 'bucket<int,int> 0x0010 00 90 01 00 00 00 00 00 00 00 00 00 18 74 6f 70 .............top 0x0020 68 61 73 68 00 00 00 00 00 00 00 18 6b 65 79 73 hash........keys 0x0030 00 08 00 00 00 00 00 18 76 61 6c 75 65 73 00 48 ........values.H 0x0040 00 00 00 00 00 18 6f 76 65 72 66 6c 6f 77 00 88 ......overflow.. 0x0050 01 00 00 00 00 00 00 ....... rel 38+4 t=31 go.info.[8]uint8+0 rel 50+4 t=31 go.info.[]key +0 rel 64+4 t=31 go.info.[]val +0 rel 81+4 t=31 go.info.*bucket<int,int>+0 go.info.*bucket<int,int> SDWARFTYPE dupok size=31 0x0000 23 2a 62 75 63 6b 65 74 3c 69 6e 74 2c 69 6e 74 #*bucket .............. rel 18+4 t=31 go.info.bucket<int,int>+0 rel 23+8 t=-32763 type.*bucket<int,int>+0 go.info.hash<int,int> SDWARFTYPE dupok size=154 0x0000 27 68 61 73 68 3c 69 6e 74 2c 69 6e 74 3e 00 30 'hash<int,int>.0 0x0010 00 00 00 00 00 00 00 00 00 18 63 6f 75 6e 74 00 ..........count. 0x0020 00 00 00 00 00 00 18 66 6c 61 67 73 00 08 00 00 .......flags.... 
0x0030 00 00 00 18 42 00 09 00 00 00 00 00 18 6e 6f 76 ....B........nov 0x0040 65 72 66 6c 6f 77 00 0a 00 00 00 00 00 18 68 61 erflow........ha 0x0050 73 68 30 00 0c 00 00 00 00 00 18 62 75 63 6b 65 sh0........bucke 0x0060 74 73 00 10 00 00 00 00 00 18 6f 6c 64 62 75 63 ts........oldbuc 0x0070 6b 65 74 73 00 18 00 00 00 00 00 18 6e 65 76 61 kets........neva 0x0080 63 75 61 74 65 00 20 00 00 00 00 00 18 65 78 74 cuate. ......ext 0x0090 72 61 00 28 00 00 00 00 00 00 ra.(...... rel 33+4 t=31 go.info.int+0 rel 46+4 t=31 go.info.uint8+0 rel 55+4 t=31 go.info.uint8+0 rel 72+4 t=31 go.info.uint16+0 rel 85+4 t=31 go.info.uint32+0 rel 100+4 t=31 go.info.*bucket<int,int>+0 rel 118+4 t=31 go.info.*bucket<int,int>+0 rel 135+4 t=31 go.info.uintptr+0 rel 148+4 t=31 go.info.*runtime.mapextra+0 go.info.*hash<int,int> SDWARFTYPE dupok size=29 0x0000 23 2a 68 61 73 68 3c 69 6e 74 2c 69 6e 74 3e 00 #*hash<int,int>. 0x0010 00 00 00 00 16 00 00 00 00 00 00 00 00 ............. rel 16+4 t=31 go.info.hash<int,int>+0 rel 21+8 t=-32763 type.*hash<int,int>+0
The compiler generates all the symbols we need. In the linker, these symbols only need to be relocated and then placed directly into the DWARF info section. We can't implement this by relying only on the types defined in the runtime compile unit. The same applies to runtime.hchan.
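To make the "relocate and place" step concrete, here is a minimal sketch of what the linker side could look like once the compiler emits the type DIEs. The `sym` and `reloc` types and the `relocate` function are hypothetical simplifications, not the real cmd/link data structures. The sketch assumes little-endian output, treats every relocation as a section-relative reference to another DWARF symbol, and ignores the 8-byte references to `type.*` symbols that appear in the dump. The `go.info.*int` bytes are copied from the dump above; the body of `go.info.int` is a placeholder.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// reloc is a hypothetical, simplified relocation record.
type reloc struct {
	off    int    // byte offset inside the symbol's data
	size   int    // width of the patched field: 4 or 8 bytes
	target string // name of the referenced symbol
	add    int64  // addend
}

// sym is a hypothetical stand-in for a compiler-emitted DWARF type symbol.
type sym struct {
	name   string
	data   []byte
	relocs []reloc
}

// relocate concatenates the symbols into one section and patches every
// relocation with the section-relative offset of its target.
func relocate(syms []sym) ([]byte, error) {
	offs := make(map[string]uint64, len(syms))
	var sect []byte
	for _, s := range syms {
		offs[s.name] = uint64(len(sect))
		sect = append(sect, s.data...)
	}
	for _, s := range syms {
		base := offs[s.name]
		for _, r := range s.relocs {
			t, ok := offs[r.target]
			if !ok {
				return nil, fmt.Errorf("%s: undefined target %s", s.name, r.target)
			}
			if r.off < 0 || r.off+r.size > len(s.data) {
				return nil, fmt.Errorf("%s: relocation at %d out of range", s.name, r.off)
			}
			v := t + uint64(r.add)
			p := sect[base+uint64(r.off):]
			switch r.size {
			case 4:
				binary.LittleEndian.PutUint32(p, uint32(v))
			case 8:
				binary.LittleEndian.PutUint64(p, v)
			default:
				return nil, fmt.Errorf("%s: unsupported relocation size %d", s.name, r.size)
			}
		}
	}
	return sect, nil
}

func main() {
	syms := []sym{
		{
			name: "go.info.*int",
			// Bytes of go.info.*int from the dump above (size=19).
			data:   []byte{0x23, 0x2a, 0x69, 0x6e, 0x74, 0x00, 0, 0, 0, 0, 0x16, 0, 0, 0, 0, 0, 0, 0, 0},
			relocs: []reloc{{off: 6, size: 4, target: "go.info.int"}},
		},
		// Placeholder body: the real go.info.int DIE is not shown in the dump.
		{name: "go.info.int", data: []byte{0, 0, 0, 0}},
	}
	sect, err := relocate(syms)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", sect)
}
```

In this sketch the 4-byte field at offset 6 of `go.info.*int` ends up holding the offset at which `go.info.int` was placed in the section; nothing about the DIE contents needs to be understood by the linker.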
Maybe I can hard-code string and slice; they are simple enough and are not templates.
If I have misunderstood something, or this still doesn't clear up your confusion, please let me know. Thank you very much for your attention to this issue.
Thanks. The compiler already has code for synthesizing runtime type descriptors for map types though. For example: https://github.com/golang/go/blob/master/src/cmd/compile/internal/reflectdata/reflect.go#L199
Can we just reuse those type descriptions?
Alternatively, can we just use DW_TAG_typedef to cross-reference bucket<int,int> to runtime.bmap? I suppose that loses information like the keys and elems fields, but are those important to DWARF users?
@mdempsky
Thanks. The compiler already has code for synthesizing runtime type descriptors for map types though. For example: https://github.com/golang/go/blob/master/src/cmd/compile/internal/reflectdata/reflect.go#L199
Can we just reuse those type descriptions?
I noticed these type descriptions before. For map, I think we can synthesize all the info from them, and the result can be the same as what the linker generates now.
Alternatively, can we just use DW_TAG_typedef to cross-reference bucket<int,int> to runtime.bmap? I suppose that loses information like the keys and elems fields, but are those important to DWARF users?
I'm afraid we can't: this is a visible change for DWARF users. I can't decide this myself; maybe CC @thanm.
A more important reason is that the instances differ in more than their names: the size of bucket<T,T> is different too. Consider bucket<string,string>: its size and field offsets differ from those of bucket<int,int> in the previous comment.
<1><64bcb>: Abbrev Number: 39 (DW_TAG_structure_type)
    <64bcc> DW_AT_name : bucket<string,string>
    <64be2> DW_AT_byte_size : 272
    <64be4> Unknown AT value: 2900: 0
    <64be5> Unknown AT value: 2904: 0x0
<2><64bed>: Abbrev Number: 24 (DW_TAG_member)
    <64bee> DW_AT_name : tophash
    <64bf6> DW_AT_data_member_location: 0
    <64bf7> DW_AT_type : <0x649c7>
    <64bfb> Unknown AT value: 2903: 0
<2><64bfc>: Abbrev Number: 24 (DW_TAG_member)
    <64bfd> DW_AT_name : keys
    <64c02> DW_AT_data_member_location: 8
    <64c03> DW_AT_type : <0x64c29>
    <64c07> Unknown AT value: 2903: 0
<2><64c08>: Abbrev Number: 24 (DW_TAG_member)
    <64c09> DW_AT_name : values
    <64c10> DW_AT_data_member_location: 136
    <64c12> DW_AT_type : <0x64c4e>
    <64c16> Unknown AT value: 2903: 0
<2><64c17>: Abbrev Number: 24 (DW_TAG_member)
    <64c18> DW_AT_name : overflow
    <64c21> DW_AT_data_member_location: 264
    <64c23> DW_AT_type : <0x64ba6>
    <64c27> Unknown AT value: 2903: 0
<2><64c28>: Abbrev Number: 0
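The offsets in this dump and in the earlier bucket<int,int> symbol follow directly from the key and value sizes, which is why a single typedef to runtime.bmap cannot describe every instance. A minimal sketch of that arithmetic follows. The `layoutBucket` function and `bucketLayout` type are hypothetical names, and the sketch assumes 8 slots per bucket, a 64-bit target, keys and values stored inline, and sizes that are multiples of 8 so that no padding is needed.

```go
package main

import "fmt"

// bucketCnt is the number of slots per bucket, matching the [8]uint8
// tophash array in the dumps.
const bucketCnt = 8

// bucketLayout holds the field offsets and total size of one
// bucket<K,V> instance (hypothetical helper type for illustration).
type bucketLayout struct {
	keys, values, overflow, size uint64
}

// layoutBucket computes the layout from the key and value sizes.
// Assumption: sizes are multiples of ptrSize, so fields need no padding.
func layoutBucket(keySize, valSize, ptrSize uint64) bucketLayout {
	keys := uint64(bucketCnt)              // right after tophash [8]uint8
	values := keys + bucketCnt*keySize     // after keys [8]K
	overflow := values + bucketCnt*valSize // after values [8]V
	return bucketLayout{
		keys:     keys,
		values:   values,
		overflow: overflow,
		size:     overflow + ptrSize, // overflow is one pointer-sized word
	}
}

func main() {
	fmt.Printf("bucket<int,int>:       %+v\n", layoutBucket(8, 8, 8))
	fmt.Printf("bucket<string,string>: %+v\n", layoutBucket(16, 16, 8))
}
```

With 8-byte int keys and values this gives offsets 8, 72, 136 and size 144, which matches the 0x48, 0x88 0x01 and 0x90 0x01 bytes in the bucket<int,int> symbol; with 16-byte strings it gives 8, 136, 264 and size 272, as in the readelf output above.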
I have now refactored https://go-review.googlesource.com/c/go/+/399877. It no longer adds a new mkprototype.go. Instead, I extended mkbuiltin.go and use the type descriptions in the pseudo runtime package. I think this is cleaner than using mkprototype.go.
I have also thought hard about how we can synthesize them without adding any hacks to the compiler. If I absolutely cannot add anything like this, there may still be a feasible way:
For slice, string, iface and eface, the DIEs can easily be hard-coded when we need to create them.
For map, we can get all the info from the descriptions the compiler already synthesizes.
But for chan, there will inevitably be some visible change in the build result. The compiler doesn't synthesize anything for it; we only have the runtime type description, so we can't synthesize a DIE identical to the current one. I noticed that there is only one variable in hchan, runtime.sudog, so I don't think it is worth keeping it as a template type; we could use a direct typedef to define any chan T as runtime.hchan. But if I do this, dynamic linking will lose DWARF type info for chan, so I don't think it is a good way. Certainly, I can't decide this either. CC @thanm.
I hope the DWARF output has no visible change, and that this issue only adds compiler-side generation and dynamic-link support. Certainly, if I absolutely cannot add anything to mkbuiltin.go, I will do what I described above.
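As an illustration of the "hard-code slice and string" option mentioned above, a hard-coded member table could look like the sketch below. The `member` type and both functions are hypothetical, not part of cmd/internal/dwarf. The slice fields (array, len, cap at offsets 0, 8, 16) are taken from the go.info.[]uint8 dump earlier in this thread; the string field names `str` and `len` are an assumption.

```go
package main

import "fmt"

// member is a hypothetical description of one DW_TAG_member entry.
type member struct {
	name string // DW_AT_name
	typ  string // name of the referenced type DIE
	off  int    // DW_AT_data_member_location
}

// sliceMembers returns the fixed member list for []elem.
// int is assumed to be pointer-sized on the target.
func sliceMembers(elem string, ptrSize int) []member {
	return []member{
		{name: "array", typ: "*" + elem, off: 0},
		{name: "len", typ: "int", off: ptrSize},
		{name: "cap", typ: "int", off: 2 * ptrSize},
	}
}

// stringMembers returns the fixed member list for string.
// The field names here are an assumption, not taken from the dump.
func stringMembers(ptrSize int) []member {
	return []member{
		{name: "str", typ: "*uint8", off: 0},
		{name: "len", typ: "int", off: ptrSize},
	}
}

func main() {
	for _, m := range sliceMembers("uint8", 8) {
		fmt.Printf("%-5s %-6s @%d\n", m.name, m.typ, m.off)
	}
	for _, m := range stringMembers(8) {
		fmt.Printf("%-5s %-6s @%d\n", m.name, m.typ, m.off)
	}
}
```

Only the element type and the pointer size vary, which is why these types do not need a template; map buckets and channels are harder because their layout depends on the instantiated types.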
Change https://go.dev/cl/400634 mentions this issue: cmd/compile,link: support generate dwarf info when -linkshared
Background
I have focused on dynamic linking (plugin, linkshared, buildmode=shared) in Go for two years. I noticed that dynamically linked libraries can't be debugged, and I know the reason is that the DWARF type info is generated in the linker. Related issues: https://github.com/golang/go/issues/38378 https://github.com/golang/go/issues/44982
Later, I saw this in https://github.com/golang/go/issues/47788#issuecomment-957967840.
@thanm said:
Thanks to @thanm, I have been trying for a long time to implement DWARF type info generation in the compiler. It now works, though it is a very early version. I think this is the right time to submit a proposal, and maybe I can improve it and contribute it to the Go community.
Proposal
The DWARF type info can be generated in the compiler instead of the linker. My initial version implements it this way:
I will send a CL to share my very early prototype.
If this proposal has some chance of being accepted, I will continue implementing it. I think there is a lot left to do. TODO:
Costs
Update 2022.5.10:
An initial to-do list for the formal changes: