OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

coredump in sgemm_ #4892

Closed: sandrew11 closed this issue 2 weeks ago

sandrew11 commented 1 month ago

#0 0x000055eca5413a75 in sgemm_incopy ()

#1 0x000055eca53f670f in sgemm_tn ()

#2 0x000055eca53f4d79 in sgemm_ ()

#3 0x000055eca52d9bda in faiss::(anonymous namespace)::exhaustive_inner_product_blas<faiss::HeapResultHandler<faiss::CMin<float, long> > > (res=..., ny=64, nx=64, d=128, y=0x7fca661be000, x=<optimized out>) at third_party/faiss/faiss/faiss/impl/ResultHandler.h:88

#4 faiss::knn_inner_product (x=, y=, d=128, nx=64, ny=, k=1, val=0x7fca62d35200, ids=0x7fca661dfc00, sel=) at third_party/faiss/faiss/faiss/utils/distances.cpp:636

#5 0x000055eca52d9d8e in faiss::knn_inner_product (x=, y=, d=, nx=, ny=, res=res@entry=0x7fca58dbe1c0, sel=0x0) at third_party/faiss/faiss/faiss/utils/distances.cpp:667

We get this core dump in some Linux environments. Could anyone help me find the reason for it?

martin-frbg commented 1 month ago

Which version of OpenBLAS, which CPU, which compiler, and what do the code and data look like that lead to the coredump?

sandrew11 commented 1 month ago

> Which version of OpenBLAS, which cpu, which compiler, what does the code and data look like that leads to the coredump ?

openblas version: 0.3.21

lscpu:
Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  Address sizes: 42 bits physical, 48 bits virtual
  Byte Order: Little Endian
CPU(s): 16
  On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
  Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
    CPU family: 6
    Model: 79
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s): 16
    Stepping: 1
    BogoMIPS: 5187.98
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat flush_l1d
Virtualization features:
  Hypervisor vendor: KVM
  Virtualization type: full
Caches (sum of all):
  L1d: 512 KiB (16 instances)
  L1i: 512 KiB (16 instances)
  L2: 64 MiB (16 instances)
  L3: 256 MiB (16 instances)
NUMA:
  NUMA node(s): 1
  NUMA node0 CPU(s): 0-15
Vulnerabilities:
  Gather data sampling: Not affected
  Itlb multihit: KVM: Mitigation: VMX unsupported
  L1tf: Mitigation; PTE Inversion
  Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
  Meltdown: Vulnerable
  Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
  Reg file data sampling: Not affected
  Retbleed: Not affected
  Spec rstack overflow: Not affected
  Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2: Vulnerable, IBPB: disabled, STIBP: disabled, PBRSB-eIBRS: Not affected
  Srbds: Not affected
  Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown

compiler: gcc

The call is in faiss::exhaustive_inner_product_blas (https://github.com/facebookresearch/faiss/blob/main/faiss/utils/distances.cpp):

sgemm_("Transpose", "Not transpose", &nyi, &nxi, &di, &one, y + j0 * d, &di, x + i0 * d, &di, &zero, ip_block.get(), &nyi);
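For reference, here is a minimal sketch of what that call computes, written as plain C++ loops with no BLAS dependency. The helper name `inner_product_block_ref` is hypothetical and is not part of FAISS or OpenBLAS. It assumes `x` and `y` hold contiguous vectors of dimension `d`, which is how the quoted call addresses them (leading dimension `di`). It does not explain the crash; it only spells out how many floats the call reads and writes, which may help when checking buffer sizes or building a standalone test case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, not part of FAISS or OpenBLAS.
// Mirrors the column-major call
//   sgemm_("Transpose", "Not transpose", &nyi, &nxi, &di, &one,
//          y, &di, x, &di, &zero, ip_block, &nyi);
// A is di x nyi (each column is one y vector), B is di x nxi (each column
// is one x vector), and C = A^T * B is nyi x nxi with leading dimension nyi.
// In memory this gives ip_block[i * ny + j] = <x_i, y_j>.
std::vector<float> inner_product_block_ref(
        const std::vector<float>& x, // nx vectors of dimension d
        const std::vector<float>& y, // ny vectors of dimension d
        std::size_t nx,
        std::size_t ny,
        std::size_t d) {
    // The call reads nx * d floats from x and ny * d floats from y,
    // and writes nx * ny floats to the output block.
    assert(x.size() >= nx * d);
    assert(y.size() >= ny * d);

    std::vector<float> ip_block(nx * ny, 0.0f);
    for (std::size_t i = 0; i < nx; ++i) {
        for (std::size_t j = 0; j < ny; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < d; ++k) {
                acc += x[i * d + k] * y[j * d + k];
            }
            ip_block[i * ny + j] = acc;
        }
    }
    return ip_block;
}
```

With the sizes shown in frame #3 of the backtrace (nx=64, ny=64, d=128), both inputs must provide at least 64 * 128 readable floats from the pointers passed in, and the output block must hold 64 * 64 floats.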

martin-frbg commented 1 month ago

I do not know FAISS. What do I need to do to trigger the error there? Does it happen with a supplied example? Also, 0.3.21 is two years old, so there is a good chance that this was resolved in the meantime.

martin-frbg commented 1 month ago

I cannot reproduce this with simple cases based on the examples in the FAISS tutorial. Please provide a code sample that demonstrates the problem.