microsoft / DiskANN

Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search

[Program received signal SIGILL, Illegal instruction] #500

Closed code-orangemonster closed 8 months ago

code-orangemonster commented 9 months ago

When I run `./apps/utils/compute_groundtruth`, the program stops and reports an illegal instruction error. Using GDB, I found that the error occurs during the execution of `void load_bin_as_float<float>(char const*, float*&, unsigned long&, unsigned long&, int)`. After I disable the OpenMP directive `#pragma omp parallel for schedule(dynamic, 32768)` within that function, that step runs successfully.

However, when execution reaches `processUnfilteredParts` in `compute_groundtruth.cpp`, the illegal instruction error occurs again. Disabling the initialization of `std::vector<std::vector<std::pair<uint32_t, float>>> res(nqueries)` within that function allows the program to run to completion.

The problem persists when I use `build_disk_index.cpp` to construct an index, which also fails with an illegal instruction error:

#0  0x00005555555792ee in void boost::program_options::validate<float, char>(boost::any&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, float*, long) ()
#1  0x00007ffff7c69a74 in boost::program_options::store(boost::program_options::basic_parsed_options<char> const&, boost::program_options::variables_map&, bool) () from /lib/x86_64-linux-gnu/libboost_program_options.so.1.74.0
#2  0x000055555556b7e9 in main ()

This seems to be an issue with my CPU, but I would like to understand the specific reasons for each error. I hope to get answers. Thank you very much.

Compiler: gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

CPU information:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
stepping : 7
microcode : 0x71a
cpu MHz : 1200.000
cache size : 15360 KB
physical id : 0
siblings : 12
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
vmx flags : vnmi preemption_timer invvpid ept_x_only ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 3800.06
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

processor : 1 reports the same values, except:
core id : 1
apicid : 2
initial apicid : 2

code-orangemonster commented 9 months ago

If I can only use this machine, which version can I install? Please recommend one for me. I would be very grateful.

daxpryce commented 8 months ago

Can you try to build and run this again like this?

# from root of git repository
cmake -S . -B build -DPORTABLE=True
cmake --build build -- -j  # like 99% sure this is right, I've been off for two weeks and don't remember how anything works anymore
build/apps/utils/compute_groundtruth <your arguments here>

I've never run into an illegal instruction error if I'm building and running on the exact same logical machine. Thanks for your question @code-orangemonster!

daxpryce commented 8 months ago

> I've never run into an illegal instruction error if I'm building and running on the exact same logical machine. Thanks for your question @code-orangemonster!

To clarify: this would be a new and exciting thing to learn about. Please do try the build with that PORTABLE flag on and see if you can replicate it again

code-orangemonster commented 8 months ago

> I've never run into an illegal instruction error if I'm building and running on the exact same logical machine. Thanks for your question @code-orangemonster!
>
> To clarify: this would be a new and exciting thing to learn about. Please do try the build with that PORTABLE flag on and see if you can replicate it again

Thank you very much for your assistance. Following your instructions, the program is now running smoothly. I truly appreciate your help.