UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

unable to initialize the model from README.md #2528

Open aabor opened 6 months ago

aabor commented 6 months ago

I have the following code as the first lines in my main function:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")

After starting, the following happens:
2024-03-04 20:14:50,252:INFO:Load pretrained SentenceTransformer: all-MiniLM-L6-v2
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

It used to work well, but it fails after a recent system upgrade. Hardware overview:

Model Name: MacBook Pro
Model Identifier: Mac14,9
Model Number: MPHE3LL/A
Chip: Apple M2 Pro
Total Number of Cores: 10 (6 performance and 4 efficiency)
Memory: 16 GB
System Firmware Version: 10151.81.1
OS Loader Version: 10151.81.1
Activation Lock Status: Disabled

Also consider the system crash report:



Translated Report (Full Report Below)

Process: Python [2020]
Path: /Library/Frameworks/Python.framework/Versions/3.11/Resources/Python.app/Contents/MacOS/Python
Identifier: org.python.python
Version: 3.11.6 (3.11.6)
Code Type: ARM-64 (Native)
Parent Process: pycharm [756]
Responsible: pycharm [756]
User ID: 501

Date/Time: 2024-03-04 20:20:38.0394 -0800
OS Version: macOS 14.3.1 (23D60)
Report Version: 12
Anonymous UUID: 66A06743-DA8E-FC87-662F-A0460A49AFF9

Time Awake Since Boot: 930 seconds

System Integrity Protection: enabled

Crashed Thread: 14

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000008
Exception Codes: 0x0000000000000001, 0x0000000000000008

Termination Reason: Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process: exc handler [2020]

VM Region Info: 0x8 is not in any region. Bytes before following region: 4310024184
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
__TEXT 100e5c000-100e60000 [ 16K] r-x/r-x SM=COW .../MacOS/Python

Thread 0:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x184809638 bsdthread_create + 8
1 libsystem_pthread.dylib 0x184847738 _pthread_create + 1044
2 libomp.dylib 0x298d06b54 kmp_create_worker + 208
3 libomp.dylib 0x298cc8bb4 kmp_allocate_thread + 1060
4 libomp.dylib 0x298cc3644 kmp_allocate_team + 2320
5 libomp.dylib 0x298cc543c kmp_fork_call + 5884
6 libomp.dylib 0x298cb8088 kmpc_fork_call + 196
7 libtorch_cpu.dylib 0x2a52a5c38 at::TensorIteratorBase::for_each(c10::function_ref<void (char*, long long const, long long, long long)>, long long) + 424
8 libtorch_cpu.dylib 0x2a78a1328 at::native::DEFAULT::direct_copy_kernel(at::TensorIteratorBase&) + 348
9 libtorch_cpu.dylib 0x2a56431b4 at::native::copy(at::Tensor&, at::Tensor const&, bool) + 2612
10 libtorch_cpu.dylib 0x2a96a4fc4 c10::impl::wrap_kernel_functorunboxed<c10::impl::detail::WrapFunctionIntoFunctor<c10::CompileTimeFunctionPointer<at::Tensor& (c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool), &torch::ADInplaceOrView::copy(c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool)>, at::Tensor&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool>>, at::Tensor& (c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool)>::call(c10::OperatorKernel, c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool) + 72
11 libtorch_cpu.dylib 0x2a96a2078 c10::impl::wrap_kernel_functorunboxed<c10::impl::detail::WrapFunctionIntoFunctor<c10::CompileTimeFunctionPointer<at::Tensor& (c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool), &torch::autograd::VariableType::(anonymous namespace)::copy(c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool)>, at::Tensor&, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool>>, at::Tensor& (c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool)>::call(c10::OperatorKernel, c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool) + 576
12 libtorch_cpu.dylib 0x2a622e4c8 at::ops::copy::call(at::Tensor&, at::Tensor const&, bool) + 280
13 libtorch_python.dylib 0x299fe5d8c torch::autograd::THPVariablecopy(_object, _object, _object*) + 512
14 Python 0x102149784 cfunction_call + 60
15 Python 0x1020e4d78 _PyObject_MakeTpCall + 128
16 Python 0x10220c2b4 _PyEval_EvalFrameDefault + 53388
17 Python 0x1022119ec _PyEval_Vector + 156

tomaarsen commented 6 months ago

Hello!

This feels like an installation issue with torch. Perhaps it is worth reinstalling that package? I don't see anything in the logs that specifically points to sentence-transformers.
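One way to test that idea is to exercise torch on its own, without importing sentence-transformers. The sketch below is only an illustration: the helper names `installed_version` and `run_isolated` are made up here, and the tensor copy is an assumption chosen because the crash report shows a copy kernel on the stack. The snippet runs in a separate interpreter with `-X faulthandler`, so a segfault there is reported instead of killing the calling process.

```python
import subprocess
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def run_isolated(code):
    """Run a snippet in a fresh interpreter so a crash cannot kill this process."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


# Exercises torch only: allocate two tensors and copy one into the other.
# (Assumption: a plain CPU tensor copy is enough to hit the failing code path.)
TORCH_ONLY = (
    "import torch\n"
    "a = torch.zeros(1000, 1000)\n"
    "b = torch.empty_like(a)\n"
    "b.copy_(a)\n"
    "print(torch.__version__)\n"
)

if __name__ == "__main__":
    for name in ("torch", "transformers", "sentence-transformers"):
        print(name, installed_version(name))
    result = run_isolated(TORCH_ONLY)
    # On POSIX, a negative return code means the child was killed by a signal
    # (-11 corresponds to SIGSEGV).
    print("return code:", result.returncode)
    print(result.stdout)
    print(result.stderr[-2000:])
```

If the torch-only snippet also ends with return code -11, that would point at the torch installation rather than at sentence-transformers; if it succeeds, the problem is more likely somewhere in the model-loading path.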

/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

I'm also not sure why you get a multiprocessing warning. Are you using multiprocessing elsewhere in your code? Searching online suggests that it could be due to RAM issues, i.e. loading too much data into RAM.
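Both questions can be checked roughly from inside the script. The following is a minimal sketch using only the standard library; the function names `peak_rss_mib` and `report` are hypothetical, and the numbers are a coarse indicator rather than a full memory profile. Note that `ru_maxrss` is reported in bytes on macOS and in kilobytes on Linux.

```python
import multiprocessing
import resource
import sys


def peak_rss_mib():
    """Peak resident memory of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # macOS reports bytes, Linux reports kilobytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def report(label):
    """Print peak memory and the number of live multiprocessing children."""
    children = multiprocessing.active_children()
    print(f"{label}: peak RSS {peak_rss_mib():.1f} MiB, "
          f"{len(children)} multiprocessing child process(es)")
    return children


if __name__ == "__main__":
    report("at startup")
```

Calling `report("before load")` just before constructing the `SentenceTransformer` and `report("after load")` just after it (if the process survives that long) would show whether memory grows sharply and whether any multiprocessing children were started by the surrounding code.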