When this library is loaded twice in the same process and both instances are used concurrently, the process crashes with a segmentation fault.
It happens regardless of:
whether the same model or different models are loaded
whether the progress bar is shown
whether the inputs are the same or different
the number of thread pool workers
Notes
I believe this issue is somewhat related to https://github.com/UKPLab/sentence-transformers/issues/1854. However, unlike that issue, the crash reported here only happens when using a GPU; on CPU it never crashes. Overall this looks like a PyTorch issue, but I'm reporting it here because I couldn't confirm that hypothesis.
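Since the hypothesis above is unconfirmed, one way to gather more evidence is Python's standard `faulthandler` module, which prints the Python-level traceback of each thread when the interpreter receives a fatal signal such as SIGSEGV. This is only a diagnostic sketch: it shows which Python call was active at the time of the crash, not the native frames inside PyTorch or CUDA.

```python
import faulthandler

# Dump the Python traceback of every thread to stderr when a fatal
# signal (for example SIGSEGV) is received. This does not prevent the
# crash; it only records where each thread was at that moment.
faulthandler.enable(all_threads=True)
```

Placing this at the top of the reproduction script, before any model is loaded, should show whether the crashing thread was inside `encode` at the time of the fault.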
import logging
from concurrent.futures import as_completed, ThreadPoolExecutor

from sentence_transformers import SentenceTransformer

logging.basicConfig(level="INFO")

executor = ThreadPoolExecutor(max_workers=11)
model1 = SentenceTransformer('all-MiniLM-L6-v2')
# happens with the same or a different model
model2 = SentenceTransformer('all-MiniLM-L6-v2')

# Alternate between the two model instances so both are used concurrently.
futures = {}
for i in range(100):
    if i % 2 == 0:
        s = ["something is wrong"]
        futures[executor.submit(model1.encode, s, show_progress_bar=False)] = s
    else:
        s = ["this should work but it crashes"]
        futures[executor.submit(model2.encode, s, show_progress_bar=False)] = s

for future in as_completed(futures):
    if future.exception() is not None:
        raise future.exception()
    else:
        print("Sentence:", futures[future])
        print("Embedding:", len(future.result()))
        print("")
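To test whether the crash really comes from overlapping `encode` calls, a rough experiment is to serialize every call behind a single lock shared by both model instances. The sketch below is an assumption-laden diagnostic, not a confirmed fix: `SerializedEncoder` and `FakeModel` are hypothetical names introduced here for illustration, and `FakeModel` only stands in for `SentenceTransformer` so the snippet runs without a GPU.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedEncoder:
    """Wrap any object with an .encode() method so calls never overlap."""

    # A single class-level lock shared by every wrapper, so calls on
    # different wrapped models are also serialized against each other.
    _lock = threading.Lock()

    def __init__(self, model):
        self._model = model

    def encode(self, *args, **kwargs):
        with SerializedEncoder._lock:
            return self._model.encode(*args, **kwargs)


class FakeModel:
    """Hypothetical stand-in for SentenceTransformer (no GPU required)."""

    def encode(self, sentences, show_progress_bar=False):
        # Return one tiny "embedding" per sentence: its length as a float.
        return [[float(len(s))] for s in sentences]


safe1 = SerializedEncoder(FakeModel())
safe2 = SerializedEncoder(FakeModel())

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [
        pool.submit((safe1 if i % 2 == 0 else safe2).encode, ["hello"])
        for i in range(10)
    ]
    results = [f.result() for f in futures]
```

In the reproduction script, this would mean submitting `SerializedEncoder(model1).encode` and `SerializedEncoder(model2).encode` instead of the raw `encode` methods. If the segmentation fault disappears, that would point toward concurrent GPU access as the trigger, at the cost of losing any parallelism between the calls.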
Hardware
Amazon p2.xlarge: 1 GPU, 4 CPUs, 61 GB RAM.
Software
Linux x86_64 GNU/Linux, Python 3.9
requirements.txt
pip list