rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Add an easy way to get details of a type's size and layout in memory #37623

Closed nnethercote closed 7 years ago

nnethercote commented 7 years ago

Multiple times recently, while looking at #36799, I have wanted to know exactly how a type is laid out in memory, including padding and unused bytes. This matters particularly for enums, where the variants can vary significantly in size and shrinking large variants can be a win.

I have successfully shrunk some types (#37445, #37577). I manually worked out their memory layouts by using a combination of println! statements and staring at the code. I can usually work out the layout, though not always. (E.g. I know that Rvalue is 152 bytes on 64-bit platforms, and that the BinaryOp and CheckedBinaryOp variants are the largest, but I haven't yet worked out why they take that much space because the nesting of the types within those variants is deep and non-obvious.)

But it is a huge pain to do it manually, as well as unreliable. It would be lovely if there was an easy way to get this information. Something like std::intrinsics::type_name, perhaps. Such a feature would have wide utility for developers of many Rust programs.

sanxiyn commented 7 years ago

variant-size-differences (default allow) lint is intended for this.
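
For readers unfamiliar with the lint, here is a minimal sketch of enabling it. The `Message` enum is hypothetical and exists only for illustration. The lint is allow-by-default, so it has to be switched on explicitly, and it only reports that one variant is much larger than the rest.

```rust
use std::mem::size_of;

// Hypothetical enum for illustration: one variant is far larger than the other.
// `variant_size_differences` is allow-by-default, so it is enabled here.
#[warn(variant_size_differences)]
#[allow(dead_code)]
enum Message {
    Ping(u8),
    Payload([u64; 32]), // 256 bytes of data; the lint should flag this variant
}

fn main() {
    // The lint says that a large gap exists; it does not show the layout itself.
    println!("Message: {} bytes", size_of::<Message>());
}
```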

leonardo-m commented 7 years ago

I suggest you show exactly what usage, API, and output the feature you are looking for is meant to have.

nagisa commented 7 years ago

We could add a -Z flag for this – we already have a number of *-stats flags which appeared for similar purposes.

michaelwoerister commented 7 years ago

I was thinking about writing a DWARF-based tool for this. DWARF contains type-layouts, so you can implement this with an external tool. Doing it directly in the compiler might be less work though.

nnethercote commented 7 years ago

> variant-size-differences (default allow) lint is intended for this.

AIUI that only points out when an enum has significant size differences between variants. That is useful, but I'm asking for a lot more detail.

> I suggest you show exactly what usage, API, and output the feature you are looking for is meant to have.

For types like this:

struct S {
  f: bool,
  g: i32,
}

enum E {
  A(i64, i32),
  B(S),
}

I'd like output something like this:

struct S: 8 bytes
- field `f: bool`: 1 bit used, 31 bits padding
- field `g: i32`: 4 bytes used, 0 bytes padding

enum E: 24 bytes
- discriminant: 8 bytes
- variant `A`: 16 bytes (largest variant)
  - field `0: i64`: 8 bytes used, 0 bytes padding
  - field `1: i32`: 4 bytes used, 4 bytes padding
- variant `B`: 8 bytes
  - field `0: S`: 8 bytes used, 8 bytes unused

There's a lot of flexibility about the invocation and exact presentation -- e.g. would it do this for all types, or just requested types? -- but this gives a good idea. (I may also have gotten some of those numbers wrong.)
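
For comparison, the manual approach mentioned at the top of the thread can be sketched with `std::mem::size_of` and `std::mem::align_of`. The `report` helper is a hypothetical name introduced here. This only yields total size and alignment, so padding still has to be inferred by hand. Note that `bool` occupies a full byte, so field `f` would be 1 byte used plus 3 bytes of padding rather than 1 bit, which is one of the numbers the sketch above may have gotten wrong.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
struct S {
    f: bool,
    g: i32,
}

#[allow(dead_code)]
enum E {
    A(i64, i32),
    B(S),
}

// Hypothetical helper: prints the total size and alignment of a type.
// It cannot show per-field offsets or padding.
fn report<T>(name: &str) {
    println!(
        "{}: size {} bytes, align {} bytes",
        name,
        size_of::<T>(),
        align_of::<T>()
    );
}

fn main() {
    report::<bool>("bool");
    report::<i32>("i32");
    report::<S>("S");
    report::<E>("E");
}
```

The exact size of `E` depends on how the compiler places the discriminant, which is the kind of detail the requested feature would make visible.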

Even better would be if it could give suggestions on more efficient orderings. E.g. if the fields in E::A are switched, does the discriminant shrink from 8 bytes to 4? I'm not sure about that, but again, you get the idea.

Mark-Simulacrum commented 7 years ago

> Even better would be if it could give suggestions on more efficient orderings. E.g. if the fields in E::A are switched, does the discriminant shrink from 8 bytes to 4? I'm not sure about that, but again, you get the idea.

This should hopefully "soon" become unnecessary thanks to @camlorn's work with struct (and I assume enum) field re-ordering. Most non repr(rust) structs/enums probably cannot be trivially reordered anyway, and so this sort of information wouldn't be helpful.


I would suggest that this information be included in the save-analysis API, perhaps? I think that's the primary information output method from rustc right now, and it can save it in a format that can then be queried for specific types/variants and printed in whatever way is preferred.

nagisa commented 7 years ago

Sure it would be. For example, if one variant of an enum is significantly larger than all the others, one way to reduce the enum's size is to introduce extra indirection. Field reordering doesn't help here, and I believe that sort of optimisation is what Nethercote is doing most in the rustc internals.
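
A minimal sketch of the indirection technique described above, using hypothetical `Inline` and `Boxed` enums that are not taken from rustc. Moving the large payload behind a `Box` leaves only a pointer inline, so the whole enum shrinks at the cost of a heap allocation for that variant.

```rust
use std::mem::size_of;

// Hypothetical types for illustration only.
#[allow(dead_code)]
enum Inline {
    Small(u8),
    Large([u64; 16]), // 128 bytes stored inline in every value of the enum
}

#[allow(dead_code)]
enum Boxed {
    Small(u8),
    Large(Box<[u64; 16]>), // one pointer inline, 128 bytes on the heap
}

fn main() {
    println!("Inline: {} bytes", size_of::<Inline>());
    println!("Boxed:  {} bytes", size_of::<Boxed>());
}
```

Reordering fields cannot achieve this, because every value of `Inline` has to reserve room for its largest variant.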


ahicks92 commented 7 years ago

Yeah, the optimizations I'm working on are optimal as far as @eddyb and I could determine, though neither of us came up with a full proof. The only way to do better is to interleave structs, but that has major implementation challenges and isn't worth it in my opinion. What I'm already doing is proving to be hard enough; there's no need to go and make major modifications to LLVM.

nnethercote commented 7 years ago

Automatic struct field reordering to reduce padding sounds very useful! And it just adds to the usefulness of the feature this issue is requesting, because it would let you see what field order the compiler chose.
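
To illustrate why field order matters, here is a small sketch with two hypothetical structs. `#[repr(C)]` keeps fields in declaration order, so its size is predictable. The default representation leaves the compiler free to reorder fields, which current compilers do, so its size is only compared here rather than pinned to a number.

```rust
use std::mem::size_of;

// Declaration order is preserved: u8, 3 bytes padding, u32, u8, 3 bytes padding.
#[allow(dead_code)]
#[repr(C)]
struct DeclaredOrder {
    a: u8,
    b: u32,
    c: u8,
}

// Default representation: the compiler may pick a different field order.
#[allow(dead_code)]
struct CompilerOrder {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    println!("repr(C):  {} bytes", size_of::<DeclaredOrder>());
    println!("default:  {} bytes", size_of::<CompilerOrder>());
}
```

Without a layout report, the only visible evidence of the chosen order is the total size, which is the gap this issue asks to close.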

sanxiyn commented 7 years ago

cc #37770.

pnkfelix commented 7 years ago

@nnethercote can we close this since #37770 has landed? Or do we want to wait until something more stable than a -Z flag (which sounds RFC worthy) has been proposed and landed?

nnethercote commented 7 years ago

#37770 is very much what I had in mind. Thank you for doing it.
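
For anyone finding this later, a sketch of trying the result on the example types from earlier in the thread. The thread itself only says "a -Z flag"; the flag name `print-type-sizes` and the invocation in the comment are stated from memory of that PR rather than from this discussion, so treat them as assumptions. The file name `example.rs` and the `make` function are hypothetical.

```rust
// Assumed invocation on a nightly toolchain (-Z flags are unstable):
//     rustc -Z print-type-sizes example.rs

#[allow(dead_code)]
struct S {
    f: bool,
    g: i32,
}

#[allow(dead_code)]
enum E {
    A(i64, i32),
    B(S),
}

// The types are actually used, on the assumption that the compiler only
// reports types whose layout it has a reason to compute.
fn make() -> E {
    E::B(S { f: true, g: 7 })
}

fn main() {
    match make() {
        E::A(x, y) => println!("A({}, {})", x, y),
        E::B(s) => println!("B(S {{ f: {}, g: {} }})", s.f, s.g),
    }
}
```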