golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
120.44k stars 17.29k forks source link

cmd/compile: static init maps should never generate code #2559

Open rsc opened 12 years ago

rsc commented 12 years ago
so that unreferenced maps can be dropped by linker
rsc commented 12 years ago

Comment 1:

Labels changed: added priority-go1, removed priority-medium.

robpike commented 12 years ago

Comment 2:

Owner changed to builder@golang.org.

rsc commented 12 years ago

Comment 4:

This is too subtle a change for Go 1 at this point.  It will have to wait.

Labels changed: added priority-later, removed priority-go1.

rsc commented 11 years ago

Comment 5:

Labels changed: added go1.1maybe.

rsc commented 11 years ago

Comment 6:

[The time for maybe has passed.]

Labels changed: removed go1.1maybe.

rsc commented 10 years ago

Comment 7:

Labels changed: added go1.2maybe.

rsc commented 10 years ago

Comment 8:

Labels changed: added feature.

robpike commented 10 years ago

Comment 9:

Labels changed: added go1.3maybe, removed go1.2maybe.

robpike commented 10 years ago

Comment 10:

Labels changed: removed go1.3maybe.

rsc commented 10 years ago

Comment 11:

Labels changed: added go1.3maybe.

rsc commented 10 years ago

Comment 12:

Labels changed: removed feature.

rsc commented 10 years ago

Comment 13:

Labels changed: added release-none, removed go1.3maybe.

rsc commented 10 years ago

Comment 14:

Labels changed: added repo-main.

remyoudompheng commented 10 years ago

Comment 15:

Is this actually possible ? The current binaries can use a different hash function
depending on the runtime system (AESNI instructions or not), and they also randomize
hash seed. In 2011 it was not the case.
gopherbot commented 5 years ago

Change https://golang.org/cl/127075 mentions this issue: html: lazily populate Unescape tables

aclements commented 5 years ago

This may in fact not be possible because the hash function and even the hash seed can be selected at runtime. However, even if we're generating init code for maps, the linker could still understand that that init code exists solely to populate that map, so if the map is unreferenced, the init code could be eliminated, too.

bradfitz commented 5 years ago

@aclements, I assumed we'd still generate the code, but make the linker a bit smarter and say that each map's initialization code would only be included in the final binary if the map's variable was reachable.

But there's also a special case in which the hash seed doesn't matter for attackability: read-only maps. As an example, look at the three predeclared* maps at the end of src/go/doc/reader.go:

https://golang.org/src/go/doc/reader.go

They're only ever read from:

$ git grep -E 'predeclared\w+\['
example.go:             if predeclaredTypes[n] || predeclaredConstants[n] || predeclaredFuncs[n] {
reader.go:              predeclared := predeclaredTypes[t.name]
reader.go:      return predeclaredTypes[s] || predeclaredFuncs[s] || predeclaredConstants[s]

So we could lay those out in read-only memory as tightly as we want if the hashmap code (delegating to an alternate implementation?) knows how to read them.

I've been on a mini crusade trying to reduce init-time CPU & allocs in #26775 and having the compiler+linker help out here would be nicer than making ugly code work around it.

/cc @randall77

aclements commented 5 years ago

So we could lay those out in read-only memory as tightly as we want if the hashmap code (delegating to an alternate implementation?) knows how to read them.

That's true. It would need to delegate to an alternate hash function, at least, which didn't use the per-process seed and didn't change based on hardware instruction support (dare I say "perfect hashing"?).

The compiler would only be able to detect unexported read-only maps, of course, but that's probably good enough.

josharian commented 5 years ago

The compiler would only be able to detect unexported read-only maps, of course, but that's probably good enough.

If we're doing this, we may as well not use the existing map implementation at all; we control all access sites, so we can use any alternative optimized implementation.

The ideal implementation probably varies significantly with the details of the data. Perhaps it is perfect hashing, perhaps it is linear or binary search over an array, perhaps it is a trie, perhaps it is a sync.Once-protected regular map. It could also perhaps populate gigantic maps incrementally by sharding based on key.

We could prototype this by writing a 'go generate' tool that converts a regular map into a read-only map (with access via a Get function). An initial implementation could just use sync.Once, and then improve from there. This would be useful for #26775 as well; instead of manually sync.Once-ing a bunch of maps, we can improve the tool and then regenerate everything that is based on it. If this turned out to be useful, and we still felt the need, we could take the tuned generation tool and bring it into the toolchain.

kevinburke commented 5 years ago

We could prototype this by writing a 'go generate' tool that converts a regular map into a read-only map (with access via a Get function).

I'd like to take a crack at this but it might take me a week.

robpike commented 5 years ago

This sounds like a lot of added technology, code, and complexity. It doesn't seem worth it to me. Yes, there is a lot of code generated to create maps at run time. But it's straightforward and easy to understand. Adding a whole extra set of map implementations to eliminate is, in my opinion, misjudged.

aclements commented 5 years ago

However, even if we're generating init code for maps, the linker could still understand that that init code exists solely to populate that map, so if the map is unreferenced, the init code could be eliminated, too.

Going back to this idea, @thanm and I were discussing a similar issue today and came up with a possible clean approach. One way of viewing the problem is that we have a chain of relocations from package to init function to the global map variable, which means we can't eliminate the global map variable even if this is the only reference to it.

We could twist this reference graph: make the global map variable have a no-op relocation to its init function and make the relocation from the package to the init function weak. This way, if the global variable is referenced, it will make its init function reachable, which would in turn resolve the weak reference to the init function so it gets called.

To my knowledge, we don't have a notion of weak relocations in the linker right now, but this is at least a well-defined and general-purpose mechanism we could add. Then the linker doesn't need to know anything special about maps or init functions; it just becomes another relocation to resolve.

aclements commented 5 years ago

However, even if we're generating init code for maps, the linker could still understand that that init code exists solely to populate that map, so if the map is unreferenced, the init code could be eliminated, too.

I had another thought on this approach. If we were to solve #14840 by detecting pure functions (which has a lot of other benefits) and marking variables that are initialized by calling a pure function as dead-codable, then we could eliminate initializers of unused maps by simply rewriting these map initializations as

var globalMap = compilerGeneratedFunction()

Even if we don't automatically detect pure functions, we could simply mark the generated function as pure (there are other uses for well-known pure functions, e.g., #22971) and do the deadcode elimination from there.

bradfitz commented 5 years ago

@aclements, that's basically what I said in https://github.com/golang/go/issues/14840#issuecomment-197542441 (which you linked to), right?

aclements commented 5 years ago

@bradfitz, it's almost the same, but with a subtle but important difference. The way I read your suggestion was to use the current construction and detect that init is side-effect-free:

var globalMap map[...]...
func init() {
  globalMap = ...
}

But init is not side-effect-free. In order to eliminate init, we'd have to detect that init's only side-effect is initializing globalMap and that nothing could have read or assigned to globalMap before init ran. If init is written by the compiler, then it can know that a priori, which is perhaps fine, but it's not about pure functions at that point.

OTOH, if the construction is

var globalMap = createGlobalMap()
func createGlobalMap() map[...]... {
  return ...
}

Then createGlobalMap is genuinely side-effect-free, which makes reasoning about dead-code elimination much easier.

bradfitz commented 5 years ago

Yeah, I was assuming there would be some new association between an init and what it initialized to make the linker's job easier.

Because even in your code example,

var globalMap = createGlobalMap()
func createGlobalMap() map[...]... {
  return ...
}

Isn't that really:

var globalMap map[K]V
func init() {
    globalMap = createGlobalMap()
}
func createGlobalMap() map[K]V {
  return ...
}

... and that init.15 or whatever I assume would need some metadata associated with it in the object file for the linker to have an dependency edge with globalMap, no?

In any case, exciting.

bradfitz commented 5 years ago

@randall77 @aclements @griesemer @ianlancetaylor, I think we should prioritize this for Go 1.13. Our binary sizes have only grown in the past 7 years. It might be time.

mvdan commented 5 years ago

I also think this could help with #29382. Lots of packages use static global maps, which I think is reasonable. However, this results in cmd/go's init cost roughly being 8% runtime.mapassign_faststr, 4% runtime.makeBucketArray, 2% runtime.mapassign_fast32, and 3% runtime.makemap, among others.

I'm not sure if the consensus is to still generate code for these global maps, but even avoiding calling the code in some unused cases would help noticeably, I think.

aclements commented 5 years ago

Was chatting with @rsc about this. There are several technical reasons why it's hard to statically generate a fully-functioning runtime map. Above we talked a bit about statically laying out just read-only maps, but that's limited by statically proving a map is read-only. Russ and I came up with a variation of this that's more practical: statically lay out maps for read-only use, but dynamically promote them to regular runtime maps if they are written to. The static layout could be the runtime's map structure (with a fixed hash and seed), but it would be better if it were something really simple. On the first write, the runtime would acquire a lock, dump the map's contents into the runtime's regular map structure, and point the map to the newly allocated map.

This could be combined with or done as follow-up to weak init functions.

seebs commented 5 years ago

clearly, you need to make maps a valid form of constant

const predeclaredTypes = map[...] {...}

then you have an unambiguous compiler hint that the map must be computable at compile time, and can be created in advance, have no init code, and not be modifiable.

(i am not yet sure how much i'm joking and how much that's an actual lucid idea. someone implement it, and if it works out, then i was totally serious, and if it's unusable, i was joking.)

ianlancetaylor commented 5 years ago

@seebs That is #6386.

seebs commented 5 years ago

That looks well-written enough that I must have been serious because it sounds like a good idea now. :)

josharian commented 5 years ago

The solution sketched by @aclements is nice.

it would be better if it were something really simple

Agreed. The question is how. This gets at something i have long wanted (and @bradfitz too, I believe), which is the ability to have multiple underlying map implementations, and switch between them. This is hard to accomplish without either adding new indirections or growing the size of a map structure.

Also worth noting is that all uses of maps end up paying the cost of the “is this the first write to a readonly map” check. Probably not a dealbreaker, but maps have been optimized to the point that individual checks like this are noticeable.

aclements commented 5 years ago

Also worth noting is that all uses of maps end up paying the cost of the “is this the first write to a readonly map” check. Probably not a dealbreaker, but maps have been optimized to the point that individual checks like this are noticeable.

In this particular case, i think we could put a bit in h.flags. I think all of the operations that matter for this already always check h.flags (e.g., for hashWriting). We could make that check for other flags at the same time at no extra cost, and then check more specific flags.

josharian commented 5 years ago

I have a working toy prototype of using multiple underlying map implementations. I now also have a decent sense for where the challenges are and what the options are. I will write something up (new issue? golang-dev?) soonish.

josharian commented 5 years ago

Apologies for the delay in sharing my prototype and notes. It has taken more cleanup than anticipated. I'm still working on it. If folks like the direction, I might be game to take on doing an initial readonly-promote-to-readwrite map implementation, but I want to start with discussing the prototype.

gopherbot commented 5 years ago

Change https://golang.org/cl/170317 mentions this issue: net/textproto: simplify commonHeader initialization

bcmills commented 5 years ago

we talked a bit about statically laying out just read-only maps, but that's limited by statically proving a map is read-only.

Statically-initialized maps are necessarily stored in package variables, and in my experience most package variables of map types are unexported and trivially non-aliasing (such as the one in CL 170317).

It seems like it would actually be pretty straightforward to prove that such maps are read-only, at least with the (IMO reasonable) assumption that they are not exposed and mutated via a //go:linkname directive.

aclements commented 5 years ago

@bcmills, given that we would need two map implementations anyway to have statically initialized maps, is there a disadvantage to dynamically upgrading a read-only map to a read/write map on first write?

(I think the canonical example of exported, static maps is the unicode package, though you're right that most are unexported, at least in std.)

bcmills commented 5 years ago

is there a disadvantage to dynamically upgrading a read-only map to a read/write map on first write?

The branch for the dynamic mode-check is either hard to predict, or needs to consume resources in the branch predictor specific to each map (or map access site), which could slow down map accesses throughout the program.

(I suppose we could only even check the branch for maps that may be in read-only mode, but if we do enough analysis to figure that out, then it seems like we've done nearly enough of the work to eliminate the branch entirely.)

gopherbot commented 5 years ago

Change https://golang.org/cl/170280 mentions this issue: runtime: add toy specialized map implementation

bcmills commented 5 years ago

There are several technical reasons why it's hard to statically generate a fully-functioning runtime map.

Could you give a little more detail about those reasons? Adding a whole separate map implementation seems pretty gnarly too.

The difficulties I would expect to encounter are:

Am I missing some others?

randall77 commented 5 years ago

Runtime hash salting is the only one I can think of. The compiler already understands most of the rest of the map layout.

Note that the per-map salt is easy to fix. It's the per-process salt that's the challenge.

aclements commented 5 years ago

Also we select the whole hash function based on what the hardware supports at runtime (e.g., AES or not). We'd have to use the lowest-common-denominator hash function and record that that's what's being used for a given map. Then we're also stuck with that until, say, we grow the map.

josharian commented 5 years ago

@randall77 @aclements any feedback about the approach sketched in https://golang.org/cl/170280? Particularly the treatment of maps and map iterators as union data types. (Sorry, it is a lot of words and code.) I’m considering starting on readonly maps following that CL as a very rough map.

randall77 commented 5 years ago

I looked through that CL. I'm worried about adding so much complexity. There's going to be an awful lot of new code that's very tricky to get right, and test. Iterators, reflection, converting between representations, etc.

I'm much more inclined to detect that a map is readonly at compile time, and somehow just special case the hash function so that we can build the map at compile time. That seems a much smaller change to get the same effect.

Of course, your change is much more general, and might make small runtime-built maps faster also. In addition, it does solve the exported-but-never-modified map problem, which we can't do at compile time. But we also have to switch on the map type for every call, and I think most of the cases in the stdlib at least are unexported global maps (TODO: anyone have actual data on this?).

In my mind, the decision comes down to how much Josh's general map setup can buy us beyond just improving startup times. If it's just startup times, I don't think it is worth it - there's a much simpler way to achieve that. If there's more to be had, and we can demonstrate where that improvement might come from, then I'm still open to the idea.

rojer commented 4 years ago

I just found a pretty bad case where this manifests - an unused map in a widely used package completely disabled DCE (#36021). Before going all the way to statically-defined maps, implementing @aclements suggestion in https://github.com/golang/go/issues/2559#issuecomment-441465386 would be very useful.

randall77 commented 4 years ago

@rasky was looking into implementing something like @aclements 's solution. Any progress?

gopherbot commented 4 years ago

Change https://golang.org/cl/210284 mentions this issue: text/template: avoid a global map to help the linker deadcode eliminate

jeremyfaller commented 3 years ago

I had a little time this morning, and did a little poking at this -- If we use @aclements approach to use weak relocations, and invert the dependency chain, we should probably also split up the init calls. As of 1.15, multiple maps in a package result in a single init being generated. If we don't split up the inits, a reference to a single global map in a package will pull all the rest in.

Still, I think @aclements approach will yield wins.