genodelabs / genode-world

Collection of community-maintained components for Genode

seoul: undefined instruction issues with g++-12 on qemu #329

Closed atopia closed 1 year ago

atopia commented 1 year ago

My tests with the seoul-auto script trigger an alignment issue in seoul: make -C build/x86_64 BOARD=pc run/seoul-auto

With KERNEL=nova:

[init -> seoul] VMM: #   ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Warning: unresolvable exception 6, pd 'init -> seoul', thread 'vCPU EP 0', cpu 0, ip=0x107e4e2 sp=0x403fe570 bp=0x1ecc0 no signal handler
[ 0] Killed EC:0xffffffff812f41c0 SC:0xffffffff81323040 V:0xfc CR0:0x10 CR3:0x0 CR4:0x0 (IPC Abort)

With my hw branch and KERNEL=hw:

[init -> seoul] VMM: #   ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Kernel: init -> seoul -> vCPU EP 0: undefined instruction at ip=0x107e4e2

objdump:

 contrib/seoul-0b454897db50b707b2c18e7ee342a00d1efb8c07/src/app/seoul/model/satadrive.cc:105
   107e490:       48 bf 00 ea 0a 01 00    movabs $0x10aea00,%rdi
   107e497:       00 00 00
   107e49a:       66 0f 6f 2f             movdqa (%rdi),%xmm5
   107e49e:       49 8d 92 00 01 00 00    lea    0x100(%r10),%rdx
   107e4a5:       48 bf 10 ea 0a 01 00    movabs $0x10aea10,%rdi
   107e4ac:       00 00 00
   107e4af:       66 0f 6f 27             movdqa (%rdi),%xmm4
   107e4b3:       49 8d b2 28 01 00 00    lea    0x128(%r10),%rsi
   107e4ba:       48 bf 20 ea 0a 01 00    movabs $0x10aea20,%rdi
   107e4c1:       00 00 00
   107e4c4:       66 0f 6f 1f             movdqa (%rdi),%xmm3
   107e4c8:       66 0f 6e 0a             movd   (%rdx),%xmm1
   107e4cc:       66 45 0f 6f c1          movdqa %xmm9,%xmm8
   107e4d1:       48 83 c2 08             add    $0x8,%rdx
   107e4d5:       48 83 c1 08             add    $0x8,%rcx
   107e4d9:       66 0f 6e 52 fc          movd   -0x4(%rdx),%xmm2
   107e4de:       66 0f 6f c1             movdqa %xmm1,%xmm0
*  107e4e2:       66 0f 38 00 cc          pshufb %xmm4,%xmm1
   107e4e7:       66 0f 6f fa             movdqa %xmm2,%xmm7
   107e4eb:       66 0f 38 00 c6          pshufb %xmm6,%xmm0
   107e4f0:       66 0f 38 00 d3          pshufb %xmm3,%xmm2
   107e4f5:       66 0f eb ca             por    %xmm2,%xmm1
   107e4f9:       66 0f 38 00 fd          pshufb %xmm5,%xmm7
   107e4fe:       66 0f eb c7             por    %xmm7,%xmm0
   107e502:       66 44 0f 64 c0          pcmpgtb %xmm0,%xmm8
   107e507:       66 0f 6f f8             movdqa %xmm0,%xmm7
   107e50b:       66 44 0f 6f d9          movdqa %xmm1,%xmm11
   107e510:       66 41 0f 60 f8          punpcklbw %xmm8,%xmm7
   107e515:       66 0f 6f d7             movdqa %xmm7,%xmm2
   107e519:       66 41 0f 6f f9          movdqa %xmm9,%xmm7
   107e51e:       66 41 0f 60 c0          punpcklbw %xmm8,%xmm0
   107e523:       66 0f 64 f9             pcmpgtb %xmm1,%xmm7
   107e527:       66 0f 70 c0 41          pshufd $0x41,%xmm0,%xmm0
   107e52c:       66 0f 71 f2 08          psllw  $0x8,%xmm2
   107e531:       66 0f 71 f0 08          psllw  $0x8,%xmm0
   107e536:       66 44 0f 60 df          punpcklbw %xmm7,%xmm11
   107e53b:       66 45 0f 6f d3          movdqa %xmm11,%xmm10
   107e540:       66 0f 60 cf             punpcklbw %xmm7,%xmm1
   107e544:       66 0f 70 c9 41          pshufd $0x41,%xmm1,%xmm1
   107e549:       66 41 0f eb d2          por    %xmm10,%xmm2
   107e54e:       66 0f eb c1             por    %xmm1,%xmm0
   107e552:       66 0f 7e 51 2e          movd   %xmm2,0x2e(%rcx)
   107e557:       66 0f 7e 41 32          movd   %xmm0,0x32(%rcx)

C++ code:

  104     for (unsigned i=0; i<20; i++)
  105       identify[27+i] = uint16(_params.name[2*i] << 8 | _params.name[2*i+1]);

This looks like an SSSE3 alignment issue, but I'm unsure what the best resolution for this case would be. As far as I've followed the discussion in genodelabs/genode#4827, previous issues involved bootstrap or ARM?

src/app/seoul/target.mk enables SSSE3 via the -march=core2 switch.

atopia commented 1 year ago

@alex-ab volunteered to look into this (thanks, Alex!). As this seems to be more of a tooling issue than a genuine seoul issue, maybe the conclusions from the other alignment issues apply, @chelmuth?

chelmuth commented 1 year ago

I assume you're referring to the following code.

https://github.com/alex-ab/seoul/blob/47c73dc55004523e0ac79267ad05e3c1563e8a78/model/satadrive.cc#L101-L107

If nobody comes up with an idea for adapting the code, we could just disable loop vectorization for this compilation unit or function...
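A minimal sketch of the per-function variant, assuming GCC: the `optimize` attribute with `"no-tree-vectorize"` corresponds to `-fno-tree-vectorize` for a single function. The function name `pack_model_name` and its parameters are hypothetical stand-ins for the loop in satadrive.cc, and the cast through `unsigned char` is an addition that is not in the original code. The GCC manual recommends this attribute for debugging rather than production use, so this is meant as an experiment rather than a final fix.

```cpp
#include <cstdint>

// Hypothetical stand-in for the loop at satadrive.cc:104-105.
// GCC only: disable the tree vectorizer for this one function so that the
// loop is compiled to scalar code instead of SSSE3 shuffles.
#if defined(__GNUC__) && !defined(__clang__)
__attribute__((optimize("no-tree-vectorize")))
#endif
void pack_model_name(std::uint16_t *identify, const char *name)
{
    for (unsigned i = 0; i < 20; i++) {
        // Go through unsigned char so that bytes >= 0x80 do not sign-extend
        // (an assumption; the original code shifts the raw value).
        std::uint16_t hi = static_cast<unsigned char>(name[2 * i]);
        std::uint16_t lo = static_cast<unsigned char>(name[2 * i + 1]);
        identify[27 + i] = static_cast<std::uint16_t>(hi << 8 | lo);
    }
}
```

For a whole compilation unit, passing `-fno-tree-vectorize` for that file would have the same effect without touching the source.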

ssumpf commented 1 year ago

Or align https://github.com/alex-ab/seoul/blob/47c73dc55004523e0ac79267ad05e3c1563e8a78/model/satadrive.cc#LL322C4-L322C33 identify to 16 Byte stack address?

chelmuth commented 1 year ago

> Or align https://github.com/alex-ab/seoul/blob/47c73dc55004523e0ac79267ad05e3c1563e8a78/model/satadrive.cc#LL322C4-L322C33 identify to 16 Byte stack address?

Good point, but I myself don't feel competent when it comes to making alignment constraints and properties clear to the compiler... still hope to learn more.

ssumpf commented 1 year ago

>> Or align https://github.com/alex-ab/seoul/blob/47c73dc55004523e0ac79267ad05e3c1563e8a78/model/satadrive.cc#LL322C4-L322C33 identify to 16 Byte stack address?

> Good point, but I myself don't feel competent when it comes to making alignment constraints and properties clear to the compiler... still hope to learn more.

__builtin_alloca maybe?
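As a hedged sketch of the alignment suggestion, standard C++ `alignas` can request a 16-byte-aligned stack buffer without resorting to `__builtin_alloca`. The buffer size of 256 words and the helper names `is_aligned` and `identify_buffer_is_aligned` are assumptions for illustration and are not taken from satadrive.cc.

```cpp
#include <cstdint>

// Returns true if the pointer value is a multiple of the given alignment.
bool is_aligned(const void *p, std::uintptr_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical stand-in for the function that declares 'identify' on the
// stack: alignas(16) asks the compiler to place the array on a 16-byte
// boundary, which is what aligned SSE loads and stores (movdqa) require.
bool identify_buffer_is_aligned()
{
    alignas(16) std::uint16_t identify[256] = {};
    return is_aligned(identify, 16);
}
```

Printing the buffer address at runtime, or checking it with a helper like the one above, is a quick way to tell whether alignment is actually the problem.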

alex-ab commented 1 year ago

I can trigger the fault, but the address of identify is 16-byte aligned (0x403fe5e0). The fault code 6 means undefined instruction (#UD); an alignment fault would be reported as 13 (#GP). The faulting instruction is pshufb. According to my investigation, the CPU model we use for qemu+svm (-cpu phenom) does not support this instruction at all (AMD Phenom has no pshufb support). After switching to a newer model (-cpu EPYC), seoul-auto works for me. I would suggest switching to a newer qemu CPU model for SVM virtualization. What do you think?
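One way to confirm this kind of diagnosis is to ask the (emulated) CPU directly whether it advertises SSSE3, the extension that provides pshufb. The sketch below assumes a GCC or Clang toolchain with `<cpuid.h>` on x86; the function names `ssse3_bit_set` and `cpu_has_ssse3` are hypothetical, and the code is generic rather than Genode-specific.

```cpp
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper for the CPUID instruction
#endif

// SSSE3 support is reported in CPUID leaf 1, ECX bit 9.
bool ssse3_bit_set(unsigned ecx)
{
    return (ecx & (1u << 9)) != 0;
}

// Returns true if the CPU the code runs on reports SSSE3.
// On non-x86 targets this sketch simply reports false.
bool cpu_has_ssse3()
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;  // CPUID leaf 1 not available
    return ssse3_bit_set(ecx);
#else
    return false;
#endif
}
```

If such a check reports no SSSE3 while the binary was built with `-march=core2`, an undefined-instruction fault on pshufb is the expected outcome.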

alex-ab commented 1 year ago

According to https://qemu-project.gitlab.io/qemu/system/qemu-cpu-models.html, "-cpu phenom" falls under "non-recommended x86 CPUs".

atopia commented 1 year ago

Good catch! I didn't know what runtime error to expect from alignment issues, hence I attributed the effect of the novel use of SSSE3 instructions to the alignment instead of to the SIMD extension as such (especially since SSSE3 was introduced with Core2 on Intel). I can confirm that changing the emulated CPU to EPYC fixes the error for me.

chelmuth commented 1 year ago

Excellent, would you mind also checking genodelabs/genode for improper qemu -cpu configurations?

alex-ab commented 1 year ago

> Excellent, would you mind also checking genodelabs/genode for improper qemu -cpu configurations?

No problem, a commit on genode staging adjusts all phenom occurrences.

chelmuth commented 1 year ago

Last night we got the error again despite df651ac1d388897cff14c49eaa5bde01d8751cc7.

Sorry, looked into the wrong log file.