microsoft / promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
https://microsoft.github.io/promptflow/
MIT License

[BUG] CUDA Error in Forked Subprocess #2465

Closed: sashokbg closed this issue 7 months ago

sashokbg commented 7 months ago

Describe the bug
When creating a batch run, one of my nodes fails with the following error:

Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

Traceback (most recent call last):
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/promptflow/_core/flow_execution_context.py", line 190, in _invoke_tool_with_timer
    return f(**kwargs)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/promptflow/_core/tracer.py", line 528, in wrapped
    output = func(*args, **kwargs)
  File "/home/alexander/Games2/degiro-faq-assistant/vector_search.py", line 27, in vector_search
    docs = store.similarity_search_with_score(question, k=2)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/langchain_community/vectorstores/pgvector.py", line 467, in similarity_search_with_score
    embedding = self.embedding_function.embed_query(query)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/langchain_community/embeddings/huggingface.py", line 108, in embed_query
    return self.embed_documents([text])[0]
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/langchain_community/embeddings/huggingface.py", line 93, in embed_documents
    embeddings = self.client.encode(
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 337, in encode
    self.to(device)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1152, in to
    return self._apply(convert)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/home/alexander/Games2/degiro-faq-assistant_venv/.venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 288, in _lazy_init
    raise RuntimeError(

How To Reproduce the bug
My node uses the LangChain PGVector store:

import os

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores.pgvector import PGVector
from promptflow import tool

# Connection string for the pgvector-backed Postgres instance, built from environment variables.
CONNECTION_STRING = PGVector.connection_string_from_db_params(
    driver=os.environ.get("PGVECTOR_DRIVER", "psycopg2"),
    host=os.environ.get("PGVECTOR_HOST", "localhost"),
    port=int(os.environ.get("PGVECTOR_PORT", "5433")),
    database=os.environ.get("PGVECTOR_DATABASE", "postgres"),
    user=os.environ.get("PGVECTOR_USER", "postgres"),
    password=os.environ.get("PGVECTOR_PASSWORD", "postgres"),
)

# Loaded once at module level; sentence-transformers moves the model to the GPU when encode() is called (see the stack trace above).
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

@tool
def vector_search(question: str) -> list[dict]:
    store = PGVector(
        collection_name="embeddings",
        connection_string=CONNECTION_STRING,
        embedding_function=embeddings,
    )

    # Top-2 nearest documents; each entry is a (Document, distance) pair.
    docs = store.similarity_search_with_score(question, k=2)

    result = []

    for doc, score in docs:
        result.append({
            "score": round(1 - score, 2),
            "content": doc.page_content,
            "title": doc.metadata['title'],
            "link": doc.metadata['link']
        })

    return result

Running Information (please complete the following information):

Additional context
Related PyTorch discussion on GitHub: https://github.com/pytorch/pytorch/issues/40403
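The failure is a general PyTorch/multiprocessing behavior rather than anything specific to this flow: once the parent process has touched CUDA, a child created with the fork start method cannot initialize it again. A minimal standalone sketch of the difference (the worker function name is hypothetical; assumes a CUDA-capable GPU and PyTorch installed):

import torch
import torch.multiprocessing as mp


def embed_in_worker(_):
    # Any CUDA use in the child process has to initialize its own CUDA context.
    return torch.ones(3, device="cuda").sum().item()


if __name__ == "__main__":
    torch.cuda.init()  # the parent touches CUDA first, like loading the embeddings model

    # With the default "fork" start method on Linux the child fails with
    # "Cannot re-initialize CUDA in forked subprocess"; "spawn" starts a fresh
    # interpreter, so CUDA initializes cleanly in the worker.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=1) as pool:
        print(pool.map(embed_in_worker, [0]))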

As a workaround, I am using:

# Force the embeddings model onto the CPU so the forked worker processes never initialize CUDA.
model_kwargs = {'device': 'cpu'}
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs=model_kwargs)

As mentioned in the docs: https://api.python.langchain.com/en/latest/embeddings/langchain_community.embeddings.huggingface.HuggingFaceEmbeddings.html

Hhhilulu commented 7 months ago

Hello @sashokbg. When executing a batch run, we use the system's default process start method to create multiple processes and increase execution parallelism.

According to the information you provided, you are using a Linux system. The default process start method on Linux is fork, which is why the error is raised during your batch run.

You can configure the process start method to spawn as described in the doc below, which may resolve the problem. Related link: How to configure environment variables
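For completeness, a minimal sketch of that configuration via the SDK, assuming (per the linked doc) that the PF_BATCH_METHOD environment variable controls the start method used for batch-run workers; the flow directory and data file below are placeholders:

import os

from promptflow import PFClient

# Assumption based on the linked doc: PF_BATCH_METHOD selects the multiprocessing
# start method ("fork" or "spawn") used for batch runs. Setting it before the run
# is created is equivalent to `export PF_BATCH_METHOD=spawn` in the shell.
os.environ["PF_BATCH_METHOD"] = "spawn"

pf = PFClient()
# Placeholder flow directory and input data; adjust to your project layout.
run = pf.run(flow="./degiro-faq-assistant", data="./questions.jsonl")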

sashokbg commented 7 months ago

@Hhhilulu I am very sorry, I missed this part of the readme!