dalek-cryptography / curve25519-dalek

A pure-Rust implementation of group operations on Ristretto and Curve25519

Issues when running the avx512ifma backend #242

Open dignifiedquire opened 5 years ago

dignifiedquire commented 5 years ago

I was trying to test out the avx512ifma backend, but I'm getting the following error.

Building with

> RUSTFLAGS="-C target_feature=+avx512ifma" cargo +nightly bench --no-default-features --features=std,simd_backend
LLVM ERROR: Cannot select: 0x7f774b281888: v4i64 = X86ISD::VPMADD52L 0x7f774b2b93a8, 0x7f774b2818f0, 0x7f774b281c30
  0x7f774b2b93a8: v4i64 = X86ISD::VSRLI 0x7f774b281820, Constant:i8<51>
    0x7f774b281820: v4i64,ch = load<(dereferenceable load 32 from %ir.13)> 0x7f774b281208, 0x7f774b2817b8, undef:i64
      0x7f774b2817b8: i64 = add nuw 0x7f774b281068, Constant:i64<128>
        0x7f774b281068: i64,ch = CopyFromReg 0x7f774b379e58, Register:i64 %1
          0x7f774b281000: i64 = Register %1
        0x7f774b281750: i64 = Constant<128>
      0x7f774b2812d8: i64 = undef
    0x7f774b281270: i8 = Constant<51>
  0x7f774b2818f0: v4i64 = X86ISD::VBROADCAST 0x7f774b2bc3a8
    0x7f774b2bc3a8: i64,ch = load<(load 8 from constant-pool)> 0x7f774b379e58, 0x7f774b281d00, undef:i64
      0x7f774b281d00: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i64 19> 0
        0x7f774b281d68: i64 = TargetConstantPool<i64 19> 0
      0x7f774b2812d8: i64 = undef
  0x7f774b281c30: v4i64 = and 0x7f774b281340, 0x7f774b2b97b8
    0x7f774b281340: v4i64,ch = load<(dereferenceable load 32 from %ir.5)> 0x7f774b281208, 0x7f774b281068, undef:i64
      0x7f774b281068: i64,ch = CopyFromReg 0x7f774b379e58, Register:i64 %1
        0x7f774b281000: i64 = Register %1
      0x7f774b2812d8: i64 = undef
    0x7f774b2b97b8: v4i64 = X86ISD::VBROADCAST 0x7f774b281af8
      0x7f774b281af8: i64,ch = load<(load 8 from constant-pool)> 0x7f774b379e58, 0x7f774b281c98, undef:i64
        0x7f774b281c98: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i64 2251799813685247> 0
          0x7f774b281b60: i64 = TargetConstantPool<i64 2251799813685247> 0
        0x7f774b2812d8: i64 = undef
In function: _ZN50_$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$4into17hb66a3d1abfd3e3a6E
error: Could not compile `curve25519-dalek`.
warning: build failed, waiting for other jobs to finish...
LLVM ERROR: Cannot select: 0x7fd212bae2d8: v4i64 = X86ISD::VPMADD52L 0x7fd212741820, 0x7fd212baea90, 0x7fd212bae410
  0x7fd212741820: v4i64 = X86ISD::VSRLI 0x7fd212741bc8, Constant:i8<51>
    0x7fd212741bc8: v4i64,ch = load<(dereferenceable load 32 from %ir.11)> 0x7fd2127411a0, 0x7fd212bae3a8, undef:i64
      0x7fd212bae3a8: i64 = add nuw 0x7fd212741888, Constant:i64<128>
        0x7fd212741888: i64,ch = CopyFromReg 0x7fd212c05fd8, Register:i64 %1
          0x7fd212741d00: i64 = Register %1
        0x7fd212a9ac30: i64 = Constant<128>
      0x7fd212bae068: i64 = undef
    0x7fd212bae340: i8 = Constant<51>
  0x7fd212baea90: v4i64 = X86ISD::VBROADCAST 0x7fd2127ce270
    0x7fd2127ce270: i64,ch = load<(load 8 from constant-pool)> 0x7fd212c05fd8, 0x7fd212a9ac98, undef:i64
      0x7fd212a9ac98: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i64 19> 0
        0x7fd212741dd0: i64 = TargetConstantPool<i64 19> 0
      0x7fd212bae068: i64 = undef
  0x7fd212bae410: v4i64 = and 0x7fd212741068, 0x7fd212a9a618
    0x7fd212741068: v4i64,ch = load<(dereferenceable load 32 from %ir.3)> 0x7fd2127411a0, 0x7fd212741888, undef:i64
      0x7fd212741888: i64,ch = CopyFromReg 0x7fd212c05fd8, Register:i64 %1
        0x7fd212741d00: i64 = Register %1
      0x7fd212bae068: i64 = undef
    0x7fd212a9a618: v4i64 = X86ISD::VBROADCAST 0x7fd212a9a478
      0x7fd212a9a478: i64,ch = load<(load 8 from constant-pool)> 0x7fd212c05fd8, 0x7fd212a9a6e8, undef:i64
        0x7fd212a9a6e8: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i64 2251799813685247> 0
          0x7fd212bae270: i64 = TargetConstantPool<i64 2251799813685247> 0
        0x7fd212bae068: i64 = undef
In function: _ZN50_$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$4into17h84437aaee38c4845E
error: Could not compile `curve25519-dalek`.

Rust version:

rustc 1.35.0-nightly (96d700f1b 2019-04-10)
binary: rustc
commit-hash: 96d700f1b7bc9c53fa0d11567adb1ed2c1c27e79
commit-date: 2019-04-10
host: x86_64-unknown-linux-gnu
release: 1.35.0-nightly
LLVM version: 8.0

Any ideas whether this is a bug in the Rust compiler or in the linking code used by the backend?

hdevalence commented 5 years ago

Do you have a Cannonlake CPU? This looks like an error that might result from specifying a target feature that your CPU doesn't support. You can check via cat /proc/cpuinfo | grep flags.
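
The same check can also be done from Rust. A minimal sketch, assuming Linux-style /proc/cpuinfo text; `has_cpu_flag` is a hypothetical helper for illustration and is not part of curve25519-dalek. On x86 targets the standard library's `is_x86_feature_detected!` macro answers the same question at run time without parsing anything.

```rust
use std::fs;

/// Hypothetical helper: returns true if `flag` appears as a whole word
/// on the first `flags` line of /proc/cpuinfo-style text.
fn has_cpu_flag(cpuinfo: &str, flag: &str) -> bool {
    cpuinfo
        .lines()
        .filter_map(|line| line.split_once(':'))
        .find(|(key, _)| key.trim() == "flags")
        .map(|(_, value)| value.split_whitespace().any(|f| f == flag))
        .unwrap_or(false)
}

fn main() {
    // Reading /proc/cpuinfo only works on Linux; elsewhere this prints a note.
    match fs::read_to_string("/proc/cpuinfo") {
        Ok(text) => println!(
            "avx512ifma in cpuinfo: {}",
            has_cpu_flag(&text, "avx512ifma")
        ),
        Err(err) => println!("could not read /proc/cpuinfo: {err}"),
    }

    // Run-time detection via the standard library (x86/x86_64 only).
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    println!(
        "detected at run time: {}",
        is_x86_feature_detected!("avx512ifma")
    );
}
```

Matching whole words matters here: a plain substring search for "avx512" would match avx512f even on a CPU without avx512ifma.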

dignifiedquire commented 5 years ago

According to /proc/cpuinfo I have

vendor_id   : GenuineIntel
cpu family  : 6
model       : 102
model name  : Intel(R) Core(TM) i3-8121U CPU @ 2.20GHz
stepping    : 3
microcode   : 0x2a
cpu MHz     : 600.211
cache size  : 4096 KB
physical id : 0
siblings    : 4
core id     : 1
cpu cores   : 2
apicid      : 3
initial apicid  : 3
fpu     : yes
fpu_exception   : yes
cpuid level : 22
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap avx512ifma clflushopt intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke flush_l1d
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips    : 4416.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual

dignifiedquire commented 5 years ago

If the extension isn't supported, I usually get a SIGILL (illegal instruction) at run time. This, however, fails at compile time, so it looks like LLVM is unable to lower the instructions properly.

hdevalence commented 5 years ago

HmmmMMMMmmmm, interesting. I'll try taking a look at this next week, but I might not have access to my Cannonlake CPU. There might be a problem because the implementation was using LLVM internals manually, since the intrinsics didn't exist when I wrote it.
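
For readers unfamiliar with the instruction named in the error: VPMADD52L is the low half of AVX512-IFMA's 52-bit multiply-add. Below is a scalar sketch of what one 64-bit lane computes, assuming only the architectural description (multiply the low 52 bits of two operands, then add the low or high 52 bits of the product to a 64-bit accumulator). The function names are hypothetical, and this is a model for illustration, not the backend's code.

```rust
/// Mask selecting the low 52 bits of a 64-bit lane.
const MASK52: u64 = (1u64 << 52) - 1;

/// Scalar model of one 64-bit lane of VPMADD52LUQ: multiply the low 52 bits
/// of `b` and `c`, keep the low 52 bits of the 104-bit product, and add
/// them to the accumulator `a` (wrapping at 64 bits).
fn madd52lo(a: u64, b: u64, c: u64) -> u64 {
    let product = u128::from(b & MASK52) * u128::from(c & MASK52);
    a.wrapping_add((product as u64) & MASK52)
}

/// Companion model of VPMADD52HUQ: same product, but the high 52 bits
/// (bits 52..104) are added to the accumulator.
fn madd52hi(a: u64, b: u64, c: u64) -> u64 {
    let product = u128::from(b & MASK52) * u128::from(c & MASK52);
    a.wrapping_add((product >> 52) as u64)
}

fn main() {
    // Operand shapes echo the constants in the error dump (shift by 51,
    // mask 2^51 - 1, multiplier 19); the input values are made up.
    let mask51 = (1u64 << 51) - 1;
    let low_limb = 0x0007_ffff_ffff_ffffu64;
    let high_limb = 5u64 << 51;
    let folded = madd52lo(low_limb & mask51, high_limb >> 51, 19);
    println!("lo: {folded}, hi: {}", madd52hi(0, 1u64 << 51, 2));
}
```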

hdevalence commented 5 years ago

Related: #256, #257. On my Cannonlake machine the current develop branch works, but trying to remove the LLVM intrinsics causes an ICE (internal compiler error) on Skylake.