Closed pikacic closed 3 years ago
Thanks a lot for looking info this @pikacic !
I would really like it if we could use more meaningful names like x86_64-avx
, x86_64-avx2
, etc., although I understand this doesn't come from thin air... Can we do both though, x86_64-v2
and x86_64-avx
which are equivalent?
@boegel we don't currently have aliases, and the names are pretty much set by gcc and clang -- it's in the ABI spec now. I think we can add aliases later if we grow a feature. I want to do that so that so that rome
, milan
, genoa
eventually map to zen2
, zen3
, zen4
, etc.
@boegel, I agree with you, but, as @tgamblin said, it's not my choice.
About the aliasing, at the current state we cannot detect to be on a x86-64-v3
or on a haswell
processor has they have strictly identical features lists, so I do not know how archspec
detection is going to behave.
OTOH x86-64-v3
is a bit less than haswell
... and a bit more, in the sense that some of the features in a haswell
CPU are not listed (see my comment on cx16
and lahf_lm
). I'd be glad if you could suggest how to proceed.
I think we should break the tie by detecting the actual architecture. In some cases the feature lists are tweaked so that we can detect on cloud nodes and VMs, where certain features are disabled. But we still know with very strong likelihood that, e.g., the chip is a haswell
.
x86-64-v3
is virtual -- i.e. there isn't actually a microarchitecture called that, and I think users will want detection to return the actual chip name. We can still say that x86-64-v3
is compatible with haswell
(but shouldn't do vice versa due to the instructions you mention).
I think the above actually works out pretty well in the implementation -- arch spec will take the "highest" architecture in the DAG that has the features of the host. If haswell
inherits from x86-64-v3
, it'll always be preferred in detection.
@boegel we don't currently have aliases, and the names are pretty much set by gcc and clang -- it's in the ABI spec now. I think we can add aliases later if we grow a feature. I want to do that so that so that
rome
,milan
,genoa
eventually map tozen2
,zen3
,zen4
, etc.
OK, x86_64-avx
(or just avx
) as an alias for a virtual microarchitecture makes sense.
Makes sense, x86_64-avx
I generally prefer dashes to _
and almost asked for it in this PR, but we made a conscious decision to use _
in arch spec so that the names would fit unambiguously in hyphenated target triplet strings (e.g. x86_64_avx-linux-gnu
)
@alalazo is there more left to do here? It would be good to have more tests, but if this works with archspec right now I'm tempted to say we should merge it. Can you try it out?
@tgamblin Will do by tomorrow. Do we have any preference on this https://github.com/archspec/archspec-json/pull/31#discussion_r630815172 ?
Basically right now we have:
nehalem
|
core2
|
[ ... ]
This PR makes it like:
nehalem
|
x86-64-v2 (virtual)
|
core2
|
[ ... ]
I think it would be better to branch in like:
nehalem
| \
core2 x86-64-v2 (virtual)
|
[ ... ]
@alalazo, something like
nehalem
| \
core2 x86-64-v2 (virtual)
|
[ ... ]
hides the fact that x86-64-v2
can run core2
binaries. I would prefer
nehalem
| \
| x86-64-v2 (virtual)
| /
core2
|
[ ... ]
@pikacic @tgamblin I'm reading the documents more closely and my understanding is that the modeling here is currently incorrect, since it adds properties to what is in the definitions of the virtual levels. For instance in this PR we have x86_64_v3
inherit from ivybridge
but that seems wrong since in the virtual levels I don't see mentioned, for instance, AES
or FSGSBASE
or RDRAND
etc. that ivybridge
has according to GCC docs.
As per https://github.com/archspec/archspec-json/pull/31#issuecomment-843416872 I think a better modeling would be to:
I suggest point 2 specifically since, by reading the documents, it doesn't seem the virtual levels are only for Intel so it would be good to keep them separated from the vendors - since they could also branch into AMD procs?
hides the fact that x86-64-v2 can run core2 binaries. I would prefer
That would be true for core2
but not for other uarchs, if my understanding is correct. So I would not special case core2
for the reason explained above.
Let me know if I missed or misunderstood any point.
@alalazo, I think now I understand a bit better the archspec architecture and I agree completely with your suggestion. Give me a few minutes to implement it.
It took me a bit longer than expected because I tweaked a bit the flags to match a bit better the instructions enabled by the uarch levels.
With the latest change (if I understand correctly how this all works) the x86-64-vX
architectures will not be matched when detecting the CPU uarch (somehow as before). The DAG is such that when a binary is built for x86-64-v3
we can run it on haswell
(or more recent) CPUs.
I also declared the compatibility of AMD CPUs with the new microarchitecture levels (deduced from GCC docs).
@alalazo, @tgamblin, can you check, please?
I'll have a look at this asap and try it in archspec
. One minor request for the layout of the JSON file: can we have the definitions for the virtual levels directly below the x86_64
definition?
I couldn't decide if to have them grouped at the beginning or at the end, so I kept them interleaved. I'll move them right away... and let me know if you want me to squash the commits.
I'll move them right away... and let me know if you want me to squash the commits.
No worries. We'll squash merge the PR in the end :slightly_smiling_face:
Thanks @pikacic I'll double check the entries and add a draft PR in archspec. I'll get back here as soon as I have news.
@pikacic I think zen
can derive from x86_64_v3
directly instead of x86_64
. Can you please double check?
@alalazo, you are right... I missed that zen
was not compatible with excavator
.
I also changed the compatibility list of bulldozer
, as ["x86_64", "x86_64_v2"]
was not particularly useful.
@pikacic Thanks! This looks great to me!
For reference, once this is merged I'll update https://github.com/archspec/archspec/pull/51
LGTM!
Thanks @pikacic!
This introduces definitions for the microarchitecture levels:
x86-64-v2
(basically an alias tonehalem
with generic tuning)x86-64-v3
(alias tohaswell
with generic tuning)x86-64-v4
(alias toskylake_avx512
with generic tuning)I also implemented equivalent compiler flags for gcc < 11.1 and clang < 12.0.
It's my first PR pm archspec, so I have some doubts about how to tune the compiler version ranges and the features list.
Resolves archspec/archspec-json#30