Open martisch opened 4 years ago
Interesting, thanks for pointing this out.
To summarize, in package unicode we have:
var Categories = map[string]*RangeTable{
"C": C,
...
}
var C = _C
var _C = &RangeTable{...}
This currently gets optimized into:
var Categories = map[string]*RangeTable{
"C": C,
...
}
var C = &_C_storage // <--
var _C = &_C_storage
var _C_storage = RangeTable{...}
The essence of this issue report is that we should further optimize it to:
var Categories = map[string]*RangeTable{
"C": &_C_storage, // <--
...
}
var C = &_C_storage
var _C = &_C_storage
var _C_storage = RangeTable{...}
and then on with the static map initialization logic.
@mdempsky
yes there is another optimization here as you point out for getting rid of the write barriers by making the map values point to globals instead of copying pointers (we may have to rewrite the tables in unicode for that if we cant make the compiler reason about this) which I think warrants its on CL and rewrite of unicode.
The generic map optimization in addition to the write barriers here is that I think we can do (and already do for some maps) is to always transform:
var Categories = map[type1]type2{
"static1": Var1,
"staric1": Var2,
...
}
to
var _GlobalMapKeys = [...]{static1, static2, ...}
var _GlobalMapValues =[...]{Var1, Var2, ...}
func init() {
GlobalMap = make(map[type1]type2, length)
for i := 0; i < len(_GlobalMapKeys); i++ {
GlobalMap[_GlobalMapKeys[i]] = _GlobalMapValues[i]
}
}
instead of
func init() {
GlobalMap = make(map[type1]type2, length)
GlobalMap["static1"] = Var1
GlobalMap["static2"] = Var2
...
}
Note that the current code even doesn't do the GlobalMap = make(map[type1]type2, length)
but gives no size hint which we should fix too.
The optimization already triggers if Var1 and Var2 are not variables but static too. My proposal is to extend the optimization to the case where the map values are are variables (for some values or all of them apart from statics).
@martisch Is your idea that
var _GlobalMapValues =[...]{Var1, Var2, ...}
would then involve dynamic initialization for copying Var1
or Var2
if necessary?
That seems like it would work, but it would mean we can't mark _GlobalMapValues as ReadOnly.
The quick compiler CL prototype I wrote that passes ./all.bash basically allows ONAME in n.Right.Op for map init elements and sets the values array as initKindDynamic and does not set the ReadOnly flag when variables are encountered. (Its on another computer) So it is the same as initializing an array of variables as elements.
These two places need change: https://github.com/golang/go/blob/a81bc8e8254d01cd442a5684801d8d2dbc553694/src/cmd/compile/internal/gc/sinit.go#L773 https://github.com/golang/go/blob/a81bc8e8254d01cd442a5684801d8d2dbc553694/src/cmd/compile/internal/gc/order.go#L1276
I see. Thanks.
I think that should work, but I'm a little hesitant about not being able to mark the memory as Readonly. Once memory is modified by a process, it can't be freed by the OS until the process exits (or until it indicates otherwise with madvise(2)
, which has page-granularity), whereas code and readonly data only used for process initialization can be flushed once they're no longer needed. In that sense, .data memory is somewhat more expensive than .text or .rodata.
I think we should try to start with just optimizing the same cases that (*InitSchedule).staticassign
can handle, since that should handle the example of package unicode.
I just realized that my explanation code above was not ideal and var _GlobalMapValues =[...]{Var1, Var2, ...}
can be local to the init function.
Iooking into the size of
unicode.init
I found that the optimization to init maps from arrays of keys and values with a for loop instead of making an mapassign call for each key/value doesnt trigger for the maps in package unicode because the values are variables and not static.The allowed types for this optimization were already expanded before and it seems there is room to go further. https://go-review.googlesource.com/c/go/+/66050/
Writing a prototype compiler CL allowing
ONAME
to also be used as a value and marking value arrays used for initialization as dynamic and not readonly (when ONAME is present) cuts the size of unicode.init roughly in half (saves 8kb currently). A hello world binary gains a 4kb size reduction.I dont see yet see a problem with side effects if all keys are static and all values are either ONAME or statics.
Unfortunately since the values in the unicode case are pointers there are still a lot of gcwritebarrier entries that bloat the init function. There seems more room to make unicode.init smaller by optimizing the init func generation.
@josharian @randall77 @bradfitz