SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

question about the __SSE2__ intrinsics ( x86intrin.h ) #639

Closed atncsj6h closed 4 months ago

atncsj6h commented 5 months ago

How crucial is the SSE2 stuff from a performance point of view?

found this .... https://github.com/DLTcollab/sse2neon

sse2neon A C/C++ header file that converts Intel SSE intrinsics to Arm/Aarch64 NEON intrinsics.

is it worth investigating ?

Out of curiosity I downloaded it, and the test suite was successful:

SSE2NEONTest Complete! Passed: 519 Failed: 0 Ignored: 4 Coverage rate: 99.23%

Fish-Git commented 4 months ago

How crucial is the SSE2 stuff from a performance point of view?

I really have no idea since I am unfamiliar with x86/x64 instructions/assembly, so I have no idea how many SSE2 instructions end up being compiled into Hercules.

If you know which x86/x64 instructions are SSE2 instructions, I suppose you could compile Hercules with the option to produce assembler output, and then examine the assembler output to see how many SSE2 instructions the compiler decided to use.

Then I suppose you could compile Hercules again with some option that requests that SSE2 instructions not be used, and then compare the performance of the two. That should answer your question.
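For anyone who wants to try the comparison described above, a first step is simply checking which SIMD instruction set the compiler is targeting for a given build. The snippet below is a minimal sketch, not code from Hercules; `simd_target` is a hypothetical helper name. It relies only on predefined compiler macros: `__SSE2__` (GCC/Clang), `_M_X64` and `_M_IX86_FP` (MSVC), and `__ARM_NEON` (Arm compilers).

```c
#include <stdio.h>

/* Hypothetical helper: reports which SIMD instruction set the compiler
 * was told it may use for this translation unit. */
static const char *simd_target(void)
{
#if defined(__SSE2__)
    return "SSE2";                 /* GCC/Clang targeting x86 with SSE2 */
#elif defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "SSE2";                 /* MSVC: x64, or 32-bit with /arch:SSE2 */
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    return "NEON";                 /* Arm build with Advanced SIMD enabled */
#else
    return "none";                 /* no SIMD set detected by these macros */
#endif
}

/* Print the result so two differently configured builds can be compared. */
static void print_simd_target(void)
{
    printf("compiler SIMD target: %s\n", simd_target());
}
```

With GCC or Clang, `-S` writes the assembler output and `-mno-sse2` asks the compiler not to emit SSE2 instructions. Since SSE2 is part of the x86-64 baseline, switching it off is mostly practical for 32-bit builds.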

Without having to go through all that effort however, based on just what Wikipedia says about SSE2:

I would venture a guess that SSE2 support is very important for performance.

Unless someone else decides to jump into this discussion with their own thoughts on the subject within the next day or two, I'm going to close this issue due to lack of interest.

atncsj6h commented 4 months ago

Hi Fish! After one month without any interest, there is no need to wait; go ahead and close it.

wrljet commented 4 months ago

There is interest. Can't we leave it open?

Fish-Git commented 4 months ago

There is interest. Can't we leave it open?

Sure. I'm just curious as to why? I mean, virtually all modern x86/x64 processors support SSE2 these days, yes? So virtually all compilers support it too, yes? Agreed? So what is there to discuss really? Whether we should allow compilers to use SSE2 instructions?? That seems silly, doesn't it? Of course we should! What we shouldn't do is anything that would prevent their use! Yes? So again, what's to discuss? The question has essentially been answered: Yes, SSE2 instructions (intrinsics) are important to performance. HOW important (to performance) they are is largely immaterial IMO. They certainly do not harm performance, that's for sure. And there's not much (if anything!) that we can do to prevent a compiler from using them, so again, what's to discuss?

I'm confused! Why do we need to keep this issue open? Please explain.

atncsj6h commented 4 months ago

Sure. I'm just curious as to why? I mean, virtually all modern x86/x64 processors support SSE2 these days, yes?

We are not talking about amd/x86 processors... we are talking about the intrinsics provided by ARM_NEON that are equivalent to the amd/x86 ones.

atncsj6h commented 4 months ago

Forgive me for the poor presentation of the issue in my initial post (I am just a poor old E3L).

I meant to open a discussion about ...

If the SSE2 intrinsics give a performance boost on amd64/x86_64, then the ARM_NEON equivalents might do the same on arm/arm64.
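To make the idea concrete, here is a minimal sketch of one operation written three ways: with SSE2 intrinsics, with the native NEON equivalents, and with a scalar fallback. It is an illustration only, not the Hercules code; `xor16` and `xor16_selfcheck` are hypothetical names. A header like sse2neon would let the SSE2 branch compile unchanged on Arm by mapping those calls onto NEON ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
  #include <emmintrin.h>           /* SSE2 intrinsics */
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>            /* NEON intrinsics */
#endif

/* Hypothetical helper: dst = a ^ b over one 16-byte block. */
static void xor16(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__SSE2__)
    /* Unaligned 128-bit loads, one XOR, one unaligned store. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)dst, _mm_xor_si128(va, vb));
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* The same three steps expressed with NEON intrinsics. */
    vst1q_u8(dst, veorq_u8(vld1q_u8(a), vld1q_u8(b)));
#else
    /* Portable fallback: one byte at a time. */
    for (size_t i = 0; i < 16; i++)
        dst[i] = (uint8_t)(a[i] ^ b[i]);
#endif
}

/* Returns 1 if the selected path matches the byte-wise result, else 0. */
static int xor16_selfcheck(void)
{
    uint8_t a[16], b[16], out[16];
    for (int i = 0; i < 16; i++) {
        a[i] = (uint8_t)i;
        b[i] = 0xFF;
    }
    xor16(out, a, b);
    for (int i = 0; i < 16; i++)
        if (out[i] != (uint8_t)(i ^ 0xFF))
            return 0;
    return 1;
}
```

Whether the NEON path actually runs faster than the scalar loop depends on the compiler and the CPU, so it still has to be measured.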

The question is moot now... I implemented the ARM_NEON infrastructure, and now all I have to do is measure the performance gain (if any). It is a pretty intrusive change: the softfloat types have to be amended to avoid clashes with the types defined in the arm_neon.h header.
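A sketch of the kind of clash meant here, under assumptions: `arm_neon.h` declares `float32_t` as a typedef for plain `float`, while Berkeley SoftFloat release 3 declares `float32_t` as a small struct. Both cannot be visible in the same translation unit. One possible fix is to rename the SoftFloat-style type. The `sf_` prefix and the helper names below are hypothetical, not what was actually done in Hercules.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>            /* declares float32_t as a float typedef */
#endif

/* Hypothetical renamed SoftFloat-style type: the raw IEEE 754 bit pattern
 * wrapped in a struct, so it cannot be mixed up with a native float.
 * Naming it float32_t here would collide with arm_neon.h on Arm builds. */
typedef struct { uint32_t v; } sf_float32_t;

/* Wrap a raw bit pattern (hypothetical helper). */
static sf_float32_t sf_from_bits(uint32_t bits)
{
    sf_float32_t f = { bits };
    return f;
}

/* Unwrap the raw bit pattern (hypothetical helper). */
static uint32_t sf_to_bits(sf_float32_t f)
{
    return f.v;
}
```

Renaming touches every place that uses the type, which is one reason a change like this is intrusive.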

I will close the issue now, no reason to keep it open