archspec / archspec-json

Other
20 stars 33 forks source link

Add support for microarchitecture levels #31

Closed pikacic closed 3 years ago

pikacic commented 3 years ago

This introduces definitions for the microarchitecture levels:

I also implemented equivalent compiler flags for gcc < 11.1 and clang < 12.0.

It's my first PR pm archspec, so I have some doubts about how to tune the compiler version ranges and the features list.

Resolves archspec/archspec-json#30

boegel commented 3 years ago

Thanks a lot for looking info this @pikacic !

I would really like it if we could use more meaningful names like x86_64-avx, x86_64-avx2, etc., although I understand this doesn't come from thin air... Can we do both though, x86_64-v2 and x86_64-avx which are equivalent?

tgamblin commented 3 years ago

@boegel we don't currently have aliases, and the names are pretty much set by gcc and clang -- it's in the ABI spec now. I think we can add aliases later if we grow a feature. I want to do that so that so that rome, milan, genoa eventually map to zen2, zen3, zen4, etc.

pikacic commented 3 years ago

@boegel, I agree with you, but, as @tgamblin said, it's not my choice.

About the aliasing, at the current state we cannot detect to be on a x86-64-v3 or on a haswell processor has they have strictly identical features lists, so I do not know how archspec detection is going to behave.

OTOH x86-64-v3 is a bit less than haswell... and a bit more, in the sense that some of the features in a haswell CPU are not listed (see my comment on cx16 and lahf_lm). I'd be glad if you could suggest how to proceed.

tgamblin commented 3 years ago

I think we should break the tie by detecting the actual architecture. In some cases the feature lists are tweaked so that we can detect on cloud nodes and VMs, where certain features are disabled. But we still know with very strong likelihood that, e.g., the chip is a haswell.

x86-64-v3 is virtual -- i.e. there isn't actually a microarchitecture called that, and I think users will want detection to return the actual chip name. We can still say that x86-64-v3 is compatible with haswell (but shouldn't do vice versa due to the instructions you mention).

tgamblin commented 3 years ago

I think the above actually works out pretty well in the implementation -- arch spec will take the "highest" architecture in the DAG that has the features of the host. If haswell inherits from x86-64-v3, it'll always be preferred in detection.

boegel commented 3 years ago

@boegel we don't currently have aliases, and the names are pretty much set by gcc and clang -- it's in the ABI spec now. I think we can add aliases later if we grow a feature. I want to do that so that so that rome, milan, genoa eventually map to zen2, zen3, zen4, etc.

OK, x86_64-avx (or just avx) as an alias for a virtual microarchitecture makes sense.

tgamblin commented 3 years ago

Makes sense, x86_64-avx

I generally prefer dashes to _ and almost asked for it in this PR, but we made a conscious decision to use _ in arch spec so that the names would fit unambiguously in hyphenated target triplet strings (e.g. x86_64_avx-linux-gnu)

tgamblin commented 3 years ago

@alalazo is there more left to do here? It would be good to have more tests, but if this works with archspec right now I'm tempted to say we should merge it. Can you try it out?

alalazo commented 3 years ago

@tgamblin Will do by tomorrow. Do we have any preference on this https://github.com/archspec/archspec-json/pull/31#discussion_r630815172 ?

Basically right now we have:

nehalem
  |
core2
  |
[ ... ]

This PR makes it like:

nehalem
  |
x86-64-v2 (virtual)
  |
core2
  |
[ ... ]

I think it would be better to branch in like:

nehalem
  |       \
core2   x86-64-v2 (virtual)
  |
[ ... ]
pikacic commented 3 years ago

@alalazo, something like

nehalem
  |       \
core2   x86-64-v2 (virtual)
  |
[ ... ]

hides the fact that x86-64-v2 can run core2 binaries. I would prefer

nehalem
  |     \
  |      x86-64-v2 (virtual)
  |     /
core2   
  |
[ ... ]
alalazo commented 3 years ago

@pikacic @tgamblin I'm reading the documents more closely and my understanding is that the modeling here is currently incorrect, since it adds properties to what is in the definitions of the virtual levels. For instance in this PR we have x86_64_v3 inherit from ivybridge but that seems wrong since in the virtual levels I don't see mentioned, for instance, AES or FSGSBASE or RDRAND etc. that ivybridge has according to GCC docs.

As per https://github.com/archspec/archspec-json/pull/31#issuecomment-843416872 I think a better modeling would be to:

  1. Have all the virtual levels inherit from each other (x86_64 -> x86_64_v2 -> etc.)
  2. Keep the virtual levels separated from "real" uarchs (i.e. no branches from a real uarch to a virtual level)
  3. Branch-in as suggested on the first uarch that supports that virtual level

I suggest point 2 specifically since, by reading the documents, it doesn't seem the virtual levels are only for Intel so it would be good to keep them separated from the vendors - since they could also branch into AMD procs?

hides the fact that x86-64-v2 can run core2 binaries. I would prefer

That would be true for core2 but not for other uarchs, if my understanding is correct. So I would not special case core2 for the reason explained above.

Let me know if I missed or misunderstood any point.

pikacic commented 3 years ago

@alalazo, I think now I understand a bit better the archspec architecture and I agree completely with your suggestion. Give me a few minutes to implement it.

pikacic commented 3 years ago

It took me a bit longer than expected because I tweaked a bit the flags to match a bit better the instructions enabled by the uarch levels.

With the latest change (if I understand correctly how this all works) the x86-64-vX architectures will not be matched when detecting the CPU uarch (somehow as before). The DAG is such that when a binary is built for x86-64-v3 we can run it on haswell (or more recent) CPUs.

I also declared the compatibility of AMD CPUs with the new microarchitecture levels (deduced from GCC docs).

@alalazo, @tgamblin, can you check, please?

alalazo commented 3 years ago

I'll have a look at this asap and try it in archspec. One minor request for the layout of the JSON file: can we have the definitions for the virtual levels directly below the x86_64 definition?

pikacic commented 3 years ago

I couldn't decide if to have them grouped at the beginning or at the end, so I kept them interleaved. I'll move them right away... and let me know if you want me to squash the commits.

alalazo commented 3 years ago

I'll move them right away... and let me know if you want me to squash the commits.

No worries. We'll squash merge the PR in the end :slightly_smiling_face:

alalazo commented 3 years ago

Thanks @pikacic I'll double check the entries and add a draft PR in archspec. I'll get back here as soon as I have news.

alalazo commented 3 years ago

@pikacic I think zen can derive from x86_64_v3 directly instead of x86_64. Can you please double check?

pikacic commented 3 years ago

@alalazo, you are right... I missed that zen was not compatible with excavator.

I also changed the compatibility list of bulldozer, as ["x86_64", "x86_64_v2"] was not particularly useful.

alalazo commented 3 years ago

@pikacic Thanks! This looks great to me!

alalazo commented 3 years ago

For reference, once this is merged I'll update https://github.com/archspec/archspec/pull/51

tgamblin commented 3 years ago

LGTM!

tgamblin commented 3 years ago

Thanks @pikacic!