rapidfuzz / RapidFuzz

Rapid fuzzy string matching in Python using various string metrics
https://rapidfuzz.github.io/RapidFuzz/
MIT License

"import rapidfuzz" causes "illegal hardware instruction" #383

Open workflowsguy opened 4 months ago

workflowsguy commented 4 months ago

While I had previously used rapidfuzz without issues in my Python scripts, every script that uses it now crashes with an error such as "illegal hardware instruction" (console) or "Terminated due to signal: ILLEGAL INSTRUCTION (4)" (development environment).

The simplest script to demonstrate this just consists of

import rapidfuzz

Versions:
rapidfuzz 3.9.0 (installed with pip install rapidfuzz)
Python 3.11.9

After I reverted to rapidfuzz 3.8.1, the issue no longer occurred.

Thanks!
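
Because the crash kills the interpreter outright, it can be easier to observe it from a separate process. The following is a minimal sketch, not part of the original report: it imports a module in a child interpreter and reports whether the child exited normally or was killed by a signal such as SIGILL. The helper names `classify_returncode` and `probe_import` are made up for illustration.

```python
import signal
import subprocess
import sys


def classify_returncode(code: int) -> str:
    """Turn a child-process return code into a readable outcome."""
    if code == 0:
        return "ok"
    if code < 0:
        # On POSIX, subprocess reports death-by-signal as a negative code.
        try:
            return f"killed by {signal.Signals(-code).name}"
        except ValueError:
            return f"killed by signal {-code}"
    return f"failed with exit code {code}"


def probe_import(module_name: str) -> str:
    """Import `module_name` in a fresh interpreter and classify the result."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
    )
    return classify_returncode(result.returncode)
```

Calling `probe_import("rapidfuzz")` on an affected machine would be expected to report a kill by SIGILL instead of taking the calling script down with it.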

maxbachmann commented 4 months ago

Strange, nothing really changed between these two versions that should affect the C++ version. Can you provide some context:

workflowsguy commented 4 months ago

Machine information:

macOS 10.13.6
Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz

maxbachmann commented 3 months ago

I think I have an idea what this could be. I built the wheels using the macos-latest image. Apparently GitHub moved this from macOS 12 x86 to macOS 14 arm in between the builds. Possibly this raised e.g. the minimum macOS version.

I will make a test build using macOS 13 x86 and link the wheel here once it's finished so you can test it.

maxbachmann commented 3 months ago

Can you test this wheel? artifact-build_wheels_macos-3.zip

workflowsguy commented 3 months ago

I installed the wheel with pip install ~/rapidfuzz-3.9.0-cp311-cp311-macosx_10_9_x86_64.whl
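
The platform tag in a wheel filename states the minimum macOS version and architecture the wheel claims to support; it does not describe which CPU instruction sets the binaries inside use. As a rough sketch (the function name `macos_targets` is hypothetical, and real tools would normally rely on a packaging library instead of a regex), the tag can be read like this:

```python
import re

# Matches a single macOS platform tag such as "macosx_10_9_x86_64".
_MACOS_TAG = re.compile(r"^macosx_(\d+)_(\d+)_(\w+)$")


def macos_targets(wheel_filename: str) -> list[tuple[int, int, str]]:
    """Return (major, minor, arch) for each macOS platform tag in a wheel name."""
    stem = wheel_filename.removesuffix(".whl")
    # The platform tag is the last dash-separated field of the filename.
    platform_part = stem.rsplit("-", 1)[-1]
    targets = []
    # Several platform tags may be joined with "." in a single filename.
    for tag in platform_part.split("."):
        match = _MACOS_TAG.match(tag)
        if match:
            major, minor, arch = match.groups()
            targets.append((int(major), int(minor), arch))
    return targets


print(macos_targets("rapidfuzz-3.9.0-cp311-cp311-macosx_10_9_x86_64.whl"))
```

For the wheel above this yields a 10.9 minimum on x86_64, so the declared deployment target alone does not explain a crash on macOS 10.13.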

After that, running the test script still causes illegal hardware instruction and Python crashes. The first lines of the crash report follow. Maybe this can give you a clue...

Process:               Python [3585]
Path:                  /opt/local/Library/Frameworks/Python.framework/Versions/3.11/Resources/Python.app/Contents/MacOS/Python
Identifier:            Python
Version:               3.11.9 (3.11.9)
Code Type:             X86-64 (Native)
Parent Process:        zsh [750]
Responsible:           Python [3585]
User ID:               501

Date/Time:             2024-05-18 11:52:12.219 +0200
OS Version:            Mac OS X 10.13.6 (17G14042)
Report Version:        12
Anonymous UUID:        A41421B4-F1A6-23F0-F507-8C741C7D2643

Sleep/Wake UUID:       269CD014-9694-4B86-8B91-B0410B57CBA8

Time Awake Since Boot: 18000 seconds
Time Since Wake:       3700 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Illegal instruction: 4
Termination Reason:    Namespace SIGNAL, Code 0x4
Terminating Process:   exc handler [0]

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   metrics_cpp_avx2.cpython-311-darwin.so  0x000000010abc1a52 __pyx_pymod_exec_metrics_cpp_avx2(_object*) + 8242
1   org.python.python               0x000000010a064de2 PyModule_ExecDef + 114
2   org.python.python               0x000000010a16873e _imp_exec_dynamic + 12
3   org.python.python               0x000000010a062bf8 cfunction_vectorcall_O + 104
4   org.python.python               0x000000010a11b35e _PyEval_EvalFrameDefault + 62106
5   org.python.python               0x000000010a10adce PyEval_EvalCode + 281
6   org.python.python               0x000000010a1058b5 builtin_exec + 491
7   org.python.python               0x000000010a062a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
8   org.python.python               0x000000010a11b35e _PyEval_EvalFrameDefault + 62106
9   org.python.python               0x000000010a11dd6e _PyEval_Vector + 96
10  org.python.python               0x000000010a00f9bf object_vacall + 296
11  org.python.python               0x000000010a00f808 PyObject_CallMethodObjArgs + 227
12  org.python.python               0x000000010a163150 PyImport_ImportModuleLevelObject + 2957
13  org.python.python               0x000000010a103044 builtin___import__ + 199
14  org.python.python               0x000000010a062a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
15  org.python.python               0x000000010a11b35e _PyEval_EvalFrameDefault + 62106
16  org.python.python               0x000000010a11dd6e _PyEval_Vector + 96
17  org.python.python               0x000000010a00f9bf object_vacall + 296
18  org.python.python               0x000000010a00f808 PyObject_CallMethodObjArgs + 227
19  org.python.python               0x000000010a162878 PyImport_ImportModuleLevelObject + 693
20  org.python.python               0x000000010a114a89 _PyEval_EvalFrameDefault + 35269
21  org.python.python               0x000000010a10adce PyEval_EvalCode + 281
22  org.python.python               0x000000010a1058b5 builtin_exec + 491
23  org.python.python               0x000000010a062a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
24  org.python.python               0x000000010a11b35e _PyEval_EvalFrameDefault + 62106
25  org.python.python               0x000000010a11dd6e _PyEval_Vector + 96
26  org.python.python               0x000000010a00f9bf object_vacall + 296
27  org.python.python               0x000000010a00f808 PyObject_CallMethodObjArgs + 227
28  org.python.python               0x000000010a163150 PyImport_ImportModuleLevelObject + 2957
29  org.python.python               0x000000010a103044 builtin___import__ + 199
30  org.python.python               0x000000010a062a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
31  org.python.python               0x000000010a11b35e _PyEval_EvalFrameDefault + 62106
32  org.python.python               0x000000010a11dd6e _PyEval_Vector + 96
33  org.python.python               0x000000010a00f9bf object_vacall + 296
34  org.python.python               0x000000010a00f808 PyObject_CallMethodObjArgs + 227
35  org.python.python               0x000000010a162878 PyImport_ImportModuleLevelObject + 693
36  org.python.python               0x000000010a114a89 _PyEval_EvalFrameDefault + 35269
37  org.python.python               0x000000010a10adce PyEval_EvalCode + 281
38  org.python.python               0x000000010a1058b5 builtin_exec + 491
39  org.python.python               0x000000010a062a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
40  org.python.python               0x000000010a11b35e _PyEval_EvalFrameDefault + 62106
41  org.python.python               0x000000010a11dd6e _PyEval_Vector + 96
42  org.python.python               0x000000010a00f9bf object_vacall + 296
43  org.python.python               0x000000010a00f808 PyObject_CallMethodObjArgs + 227
44  org.python.python               0x000000010a163150 PyImport_ImportModuleLevelObject + 2957
45  org.python.python               0x000000010a114a89 _PyEval_EvalFrameDefault + 35269
46  org.python.python               0x000000010a10adce PyEval_EvalCode + 281
47  org.python.python               0x000000010a18c9e1 run_eval_code_obj + 78
48  org.python.python               0x000000010a18ce31 run_mod + 96
49  org.python.python               0x000000010a18cdaa pyrun_file + 133
50  org.python.python               0x000000010a18c3dd _PyRun_SimpleFileObject + 558
51  org.python.python               0x000000010a18b8b6 _PyRun_AnyFileObject + 136
52  org.python.python               0x000000010a1b2e48 Py_RunMain + 2222
53  org.python.python               0x000000010a1b3af0 pymain_main + 482
54  org.python.python               0x000000010a1b3c74 Py_BytesMain + 42
55  libdyld.dylib                   0x00007fff71a44015 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x000000010ae3571f  rbx: 0x000000010ae444eb  rcx: 0x000000010ae4b1a8  rdx: 0x000000010ae4ae68
  rdi: 0x000000010ab1a6d0  rsi: 0x000000010ab50170  rbp: 0x00007ffee5c7c220  rsp: 0x00007ffee5c7b000
   r8: 0x3891b6418558226a   r9: 0x10b1d3cd09b79304  r10: 0x000000010ab501b8  r11: 0x000000010ab501a0
  r12: 0x000000010ab1a480  r13: 0x000000010aba53e8  r14: 0x000000010ae35420  r15: 0x0000000000000000
  rip: 0x000000010abc1a52  rfl: 0x0000000000010206  cr2: 0x000000010abc1000

Logical CPU:     6
Error Code:      0x00000000
Trap Number:     6
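
The key detail in a report like the one above is frame 0 of the crashed thread, which names the binary image and symbol that executed the illegal instruction. A small sketch for pulling that frame out of the report text follows; the `Frame` type and `crashing_frame` function are illustrative names, and the regex assumes the frame layout shown in this report.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    index: int
    image: str
    address: int
    symbol: str
    offset: int


# Layout assumed: "<index> <image> <0xaddress> <symbol> + <offset>"
_FRAME = re.compile(r"^(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(.+?)\s+\+\s+(\d+)\s*$")


def crashing_frame(report: str) -> Optional[Frame]:
    """Return the first frame listed under 'Thread N Crashed', if any."""
    in_crashed_thread = False
    for line in report.splitlines():
        if re.match(r"^Thread \d+ Crashed", line):
            in_crashed_thread = True
            continue
        if in_crashed_thread:
            match = _FRAME.match(line.strip())
            if match:
                idx, image, addr, symbol, offset = match.groups()
                return Frame(int(idx), image, int(addr, 16), symbol, int(offset))
            if not line.strip():
                break  # Blank line ends the thread's backtrace.
    return None
```

Applied to the report above, this returns the `metrics_cpp_avx2.cpython-311-darwin.so` image and the `__pyx_pymod_exec_metrics_cpp_avx2(_object*)` symbol, i.e. the crash happens while the AVX2 extension module is being initialized.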

maxbachmann commented 3 months ago

Thanks. Can you test the following package as well? artifact-build_wheels_macos-0.zip

This is version 3.8.1 built with the current toolchains since I suspect this is a build issue and not an issue with the library itself.

workflowsguy commented 3 months ago

This also causes the error message illegal hardware instruction in the console, followed by a crash of Python:

Process:               Python [61921]
Path:                  /opt/local/Library/Frameworks/Python.framework/Versions/3.11/Resources/Python.app/Contents/MacOS/Python
Identifier:            Python
Version:               3.11.9 (3.11.9)
Code Type:             X86-64 (Native)
Parent Process:        zsh [779]
Responsible:           Python [61921]
User ID:               501

Date/Time:             2024-05-19 10:40:48.508 +0200
OS Version:            Mac OS X 10.13.6 (17G14042)
Report Version:        12
Anonymous UUID:        A41421B4-F1A6-23F0-F507-8C741C7D2643

Time Awake Since Boot: 16000 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Illegal instruction: 4
Termination Reason:    Namespace SIGNAL, Code 0x4
Terminating Process:   exc handler [0]

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   metrics_cpp_avx2.cpython-311-darwin.so  0x0000000103d76543 __pyx_pymod_exec_metrics_cpp_avx2(_object*) + 8179
1   org.python.python               0x0000000103219de2 PyModule_ExecDef + 114
2   org.python.python               0x000000010331d73e _imp_exec_dynamic + 12
3   org.python.python               0x0000000103217bf8 cfunction_vectorcall_O + 104
4   org.python.python               0x00000001032d035e _PyEval_EvalFrameDefault + 62106
5   org.python.python               0x00000001032bfdce PyEval_EvalCode + 281
6   org.python.python               0x00000001032ba8b5 builtin_exec + 491
7   org.python.python               0x0000000103217a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
8   org.python.python               0x00000001032d035e _PyEval_EvalFrameDefault + 62106
9   org.python.python               0x00000001032d2d6e _PyEval_Vector + 96
10  org.python.python               0x00000001031c49bf object_vacall + 296
11  org.python.python               0x00000001031c4808 PyObject_CallMethodObjArgs + 227
12  org.python.python               0x0000000103318150 PyImport_ImportModuleLevelObject + 2957
13  org.python.python               0x00000001032b8044 builtin___import__ + 199
14  org.python.python               0x0000000103217a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
15  org.python.python               0x00000001032d035e _PyEval_EvalFrameDefault + 62106
16  org.python.python               0x00000001032d2d6e _PyEval_Vector + 96
17  org.python.python               0x00000001031c49bf object_vacall + 296
18  org.python.python               0x00000001031c4808 PyObject_CallMethodObjArgs + 227
19  org.python.python               0x0000000103317878 PyImport_ImportModuleLevelObject + 693
20  org.python.python               0x00000001032c9a89 _PyEval_EvalFrameDefault + 35269
21  org.python.python               0x00000001032bfdce PyEval_EvalCode + 281
22  org.python.python               0x00000001032ba8b5 builtin_exec + 491
23  org.python.python               0x0000000103217a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
24  org.python.python               0x00000001032d035e _PyEval_EvalFrameDefault + 62106
25  org.python.python               0x00000001032d2d6e _PyEval_Vector + 96
26  org.python.python               0x00000001031c49bf object_vacall + 296
27  org.python.python               0x00000001031c4808 PyObject_CallMethodObjArgs + 227
28  org.python.python               0x0000000103318150 PyImport_ImportModuleLevelObject + 2957
29  org.python.python               0x00000001032b8044 builtin___import__ + 199
30  org.python.python               0x0000000103217a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
31  org.python.python               0x00000001032d035e _PyEval_EvalFrameDefault + 62106
32  org.python.python               0x00000001032d2d6e _PyEval_Vector + 96
33  org.python.python               0x00000001031c49bf object_vacall + 296
34  org.python.python               0x00000001031c4808 PyObject_CallMethodObjArgs + 227
35  org.python.python               0x0000000103317878 PyImport_ImportModuleLevelObject + 693
36  org.python.python               0x00000001032c9a89 _PyEval_EvalFrameDefault + 35269
37  org.python.python               0x00000001032bfdce PyEval_EvalCode + 281
38  org.python.python               0x00000001032ba8b5 builtin_exec + 491
39  org.python.python               0x0000000103217a53 cfunction_vectorcall_FASTCALL_KEYWORDS + 95
40  org.python.python               0x00000001032d035e _PyEval_EvalFrameDefault + 62106
41  org.python.python               0x00000001032d2d6e _PyEval_Vector + 96
42  org.python.python               0x00000001031c49bf object_vacall + 296
43  org.python.python               0x00000001031c4808 PyObject_CallMethodObjArgs + 227
44  org.python.python               0x0000000103318150 PyImport_ImportModuleLevelObject + 2957
45  org.python.python               0x00000001032c9a89 _PyEval_EvalFrameDefault + 35269
46  org.python.python               0x00000001032bfdce PyEval_EvalCode + 281
47  org.python.python               0x00000001033419e1 run_eval_code_obj + 78
48  org.python.python               0x0000000103341e31 run_mod + 96
49  org.python.python               0x0000000103341daa pyrun_file + 133
50  org.python.python               0x00000001033413dd _PyRun_SimpleFileObject + 558
51  org.python.python               0x00000001033408b6 _PyRun_AnyFileObject + 136
52  org.python.python               0x0000000103367e48 Py_RunMain + 2222
53  org.python.python               0x0000000103368af0 pymain_main + 482
54  org.python.python               0x0000000103368c74 Py_BytesMain + 42
55  libdyld.dylib                   0x00007fff557b1015 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000103fea72f  rbx: 0x0000000103ff94eb  rcx: 0x00000001040001a0  rdx: 0x0000000103fffe68
  rdi: 0x0000000103cce860  rsi: 0x0000000103d00cb0  rbp: 0x00007ffeecac8220  rsp: 0x00007ffeecac7020
   r8: 0xc0b04e9999526c62   r9: 0x01c27fe65c0f7904  r10: 0x0000000103d00cf8  r11: 0x0000000103d00ce0
  r12: 0x0000000103cce610  r13: 0x0000000103d5a3e8  r14: 0x0000000103fea440  r15: 0x0000000000000000
  rip: 0x0000000103d76543  rfl: 0x0000000000010206  cr2: 0x000000010bd2a000

Logical CPU:     5
Error Code:      0x00000000
Trap Number:     6

maxbachmann commented 3 months ago

@henryiii can you have a quick look at this? You are usually more familiar with ecosystem changes in Python build tools.

To give a short recap of what is happening here:

1) The macOS build from 07.04.2024 works as expected: https://github.com/rapidfuzz/RapidFuzz/actions/runs/8591457629/job/23540277660
2) The macOS build of the exact same source code from 18.05.2024 causes an illegal instruction error on old versions of macOS: https://github.com/rapidfuzz/RapidFuzz/actions/runs/9140181829/job/25133302598

So this sounds very much like something broke due to implicit updates of build tooling. Looking at the build log I noticed a couple of differences:

I don't see any other differences between the two builds, but maybe I am missing something. Do you have any idea what could go wrong here?

maxbachmann commented 3 months ago

Can you give the following build a try: artifact-build_wheels_macos-0.zip

This is a build without the avx2 versions

henryiii commented 3 months ago

As a first step, I'd recommend using macos-X and macos-14, where X < 14, building only native packages on each. (I have no idea why 12 isn't working, that was the default for a long time and cibuildwheel should be fine on it, certainly shouldn't have an issue!) If you want a universal wheel too (pip won't ever download it with the native packages present), then you can fuse the native packages into a universal2 package. (I think it's not too hard, but don't remember where I saw it done).

The only similar issue I've seen so far was that either uv or ruff builds with an old macOS because there was an issue using the newer Xcode when targeting older macOS. I haven't checked yet to see exactly what the issue is, they are building on macos-11 for now. (maturin-action and Rust, not cibuildwheel)

I technically do have access to a macOS <11 machine (10.12 IIRC - it's an iMac with an NVIDIA graphics card, it's that old!), but it's in my old office and I won't be going there probably at least for a week. (Currently at PyCon, so def. don't have access to it now!)

workflowsguy commented 3 months ago

Can you give the following build a try: artifact-build_wheels_macos-0.zip

This is a build without the avx2 versions

No crash from the test script 👍

henryiii commented 3 months ago

On a branch macOS-12 worked. But sounds like this is the correct fix. Does avx make it noticeably faster? You could make a higher minimum version wheel too with avx if it was important.

maxbachmann commented 3 months ago

On a branch macOS-12 worked.

I don't see anything directly obvious on what I might have done wrong in https://github.com/rapidfuzz/RapidFuzz/commit/177663a371577e4e56359221ad0d1bea51a92287 that would lead to no build targets being selected

But sounds like this is the correct fix.

Which fix do you mean? Disabling avx2, using macOS-12, or something else?

Does avx make it noticeably faster?

When performing many x many comparisons it actually yields close to a 2x performance improvement for some metrics, since it allows me to compare twice as many strings in parallel. So I would prefer to keep it. Currently I include the binary twice and just load the correct one based on CPU feature flags.

You could make a higher minimum version wheel too with avx if it was important.

Do you mean publishing both an 10.9+ version using only sse2 and a version targeting a later version which guarantees AVX2? If so which one does actually guarantee this? I mean the cpu in question here (i7 870) already supports AVX2.
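
The dispatch described above (shipping the binary twice and loading one copy based on CPU feature flags) can be sketched in Python as follows. This is an illustration under stated assumptions, not RapidFuzz's actual loader: the function names are hypothetical, the returned labels "avx2" and "sse2" stand in for the two builds, and reading the flags through the `machdep.cpu.features` and `machdep.cpu.leaf7_features` sysctl keys is assumed to apply to Intel Macs only.

```python
import subprocess


def parse_flags(raw: str) -> set[str]:
    """Split a whitespace-separated feature string into a set of flags."""
    return {token.upper() for token in raw.split()}


def read_macos_cpu_flags() -> set[str]:
    """Read CPU feature flags via sysctl (assumed keys, Intel macOS only)."""
    flags: set[str] = set()
    for key in ("machdep.cpu.features", "machdep.cpu.leaf7_features"):
        try:
            out = subprocess.run(
                ["sysctl", "-n", key],
                capture_output=True,
                text=True,
                check=True,
            ).stdout
        except (OSError, subprocess.CalledProcessError):
            continue  # Key or tool unavailable: treat as "no flags reported".
        flags |= parse_flags(out)
    return flags


def pick_backend(flags: set[str]) -> str:
    """Choose the AVX2 build only when the CPU reports AVX2."""
    return "avx2" if "AVX2" in flags else "sse2"


print(pick_backend(parse_flags("FPU SSE SSE2 SSE4.2")))
```

Running `read_macos_cpu_flags()` on the affected machine would show what the CPU actually reports, which helps separate "the wrong build was selected" from "the selected build contains instructions it should not".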

henryiii commented 3 months ago

I don't see anything directly obvious

I don't either, but #384 passes.

Disabling avx2 using macOs-12 or something else?

I was thinking avx2, but it was previously working, so to me this sounds like a bug. What's odd, though, is that the 13+ image (probably Xcode is the important part; the previous macOS always updates Xcode once to the next macOS's Xcode) does seem to work, except on older macOS. It would be interesting to list the binary symbols, I think.

I think since macos-12 does work (in #384), the simplest thing might be to use that (assuming it fixes the problem) until GHA drops the image. I think the macos-11 image is still usable but deprecated, so it's probably ~1 year unless there are hardware reasons to drop it sooner.

Do you mean publishing both an 10.9+ version using only sse2 and a version targeting a later version which guarantees AVX2?

Yes, but I don't know what version this would be, somewhere between 10.13 and 13; since it only happens on the actual macOS versions, it would be really hard to pin down.

I rather expect this is a bug and it might get ironed out in later releases, if we could pin it down a bit more possibly could even report it.

henryiii commented 3 months ago

I don't see anything directly obvious on what I might have done wrong in 177663a that would lead to no build targets being selected

You forgot a quote.

maxbachmann commented 3 months ago

I don't see anything directly obvious on what I might have done wrong in 177663a that would lead to no build targets being selected

You forgot a quote.

Oh true

maxbachmann commented 3 months ago

@workflowsguy I published a version with avx2 disabled yesterday to take a bit of the urgency out of this.

maxbachmann commented 3 months ago

I rather expect this is a bug and it might get ironed out in later releases, if we could pin it down a bit more possibly could even report it.

Yes, I think so too. It already fails in Cython's init functions for the module because it comes across an unsupported instruction. Since I only enable AVX2, which is supported on the CPU in question, this really shouldn't occur. So maybe an empty Cython module would be a good first test.

workflowsguy commented 3 months ago

@workflowsguy I published a version with avx2 disabled yesterday to take a bit of the urgency out of this.

Thanks!

maxbachmann commented 3 months ago

I rather expect this is a bug and it might get ironed out in later releases, if we could pin it down a bit more possibly could even report it.

I can't really test this myself since I don't have access to a machine that leads to the crash to try and create a minimal reproducer. However this appears to already crash pretty early on while initializing the module. So we might be able to reduce the sample by quite a bit.