archspec / archspec-json

Other
20 stars 32 forks source link

Add definition of Neoverse N2 features #105

Closed paolotricerri closed 1 month ago

paolotricerri commented 3 months ago

This patch should address https://github.com/archspec/archspec-json/issues/97

paolotricerri commented 3 months ago

How can I invite a colleague (username @dslarm) to the review?

alalazo commented 3 months ago

@paolotricerri Anybody can leave a review, as far as I know - so should be fine to just ping them. For the "Reviewers" list in the PR I guess you need to have more rights on the upstream repository.

alalazo commented 3 months ago

Can you give a more descriptive title, and explain in the description how this solves the issue?

In the issue it is stated that:

the issue here is that the neoverse_n2 and neoverse_v2 have exactly the same features - so matching by closest feature list would not be right.

but from this changeset I see that neoverse_n2 $\subset$ neoverse_v2, with the features sm3, sm4, and svesm4 which are only in v2. Does this mean the claim above is incorrect?

EDIT: I see now #106 If that is merged the features are not enough to distinguish among the two, as @dslarm claims. How does this PR address #97 then ?

paolotricerri commented 2 months ago

@alalazo I agree that the features across neoverse_n2 and neoverse_v2 are the same (once -and if- https://github.com/archspec/archspec-json/pull/106 is merged) and so distinguishing purely on features won't be possible.

To distinguish the two architectures, I would propose to add a check based on cpuid (I think this is where the addition of cpuid comes into play from the other PRs I have opened) either in the compatibility checks for aarch64 or around the sorting function in host(). I prefer option 1 because it is something only for arch64 and because if cpuid is present then the list of candidates returned by compatibility_check inside the host() function should be a length of length 1 really, or at most a list of length 2, that is [ generic_aarch64_cpu, specific_aarch64_core_with_cpuid ] ). If cpuid is not present then we revert to the current mechanism with multiple candidates and a sorting mechanism on them. Option 2 sounds a pollution of the host() function tbh.

What do you think?

alalazo commented 1 month ago

Closing in favor of #112