explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0
6.78k stars, 667 forks

embedding nodes: 0%| Segmentation fault (core dumped) #962

Open WGS-note opened 4 months ago

WGS-note commented 4 months ago

Hi, my code is as follows:

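# Imports were not shown in the original post; these are assumed for ragas 0.1.x
# with the langchain_community and transformers packages.
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.llms import HuggingFacePipeline
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

device = "cuda:2"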

loader = DirectoryLoader("./knowledge_base/DT_test/content")
documents = loader.load()

for document in documents:
    document.metadata['filename'] = document.metadata['source']

pipe = pipeline(
    "text-generation",
    model=AutoModelForCausalLM.from_pretrained("/home/models/chatglm3-6b", trust_remote_code=True),
    tokenizer=AutoTokenizer.from_pretrained("/home/models/chatglm3-6b", trust_remote_code=True),
    device=device
)
generator_llm = HuggingFacePipeline(pipeline=pipe)

embeddings = HuggingFaceBgeEmbeddings(
    model_name="/home/models/bge-large-zh",
    model_kwargs={'device': device},
    encode_kwargs={'normalize_embeddings': True},
)

generator = TestsetGenerator.from_langchain(
    generator_llm=generator_llm,
    critic_llm=generator_llm,
    embeddings=embeddings
)

testset = generator.generate_with_langchain_docs(
    documents,
    test_size=3,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
    is_async=False,
)

print(testset.to_pandas())

I am getting the following error:

embedding nodes:   0%|          | 0/28 [00:00<?, ?it/s]

Segmentation fault (core dumped)

May I ask what is causing this?

I would be extremely grateful for any help!
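
A segmentation fault kills the interpreter before Python can print a traceback, so the output above does not show which call crashed. One way to narrow it down, sketched here as a suggestion rather than something from the original report, is to enable the standard-library `faulthandler` module at the top of the script. The log path and the name `TRACE_PATH` are hypothetical.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
TRACE_PATH = os.path.join(tempfile.gettempdir(), "ragas_segfault_trace.log")

# The file object must stay open for as long as faulthandler is enabled.
trace_file = open(TRACE_PATH, "w")

# On SIGSEGV (and other fatal signals), dump the Python stack of every thread
# to the log file before the process dies.
faulthandler.enable(file=trace_file, all_threads=True)

# ... build the pipeline, embeddings and TestsetGenerator here, then call
# generator.generate_with_langchain_docs(...) as in the script above.
```

After the crash, the log should show whether the last Python frame was inside the embedding model, the text-generation pipeline, or ragas itself. Running the script with `python -X faulthandler` has a similar effect but writes to stderr instead of a file.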

jjmachan commented 4 months ago

Hey @WGS-note, are you still facing this? It is likely caused by how the async execution is interfacing with the local model, which leads to the crash. Is there any chance you can run this against a vLLM instance?
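
For reference, a rough sketch of what running against a vLLM instance could look like. It assumes the model is served through vLLM's OpenAI-compatible server (for example `python -m vllm.entrypoints.openai.api_server --model /home/models/chatglm3-6b --trust-remote-code`), that the server listens on the default `http://localhost:8000/v1`, and that the `langchain_openai` package is installed. The helper names `vllm_chat_kwargs` and `build_generator` are hypothetical, and whether this particular model loads in vLLM has not been verified here.

```python
def vllm_chat_kwargs(base_url="http://localhost:8000/v1",
                     model="/home/models/chatglm3-6b"):
    """Connection settings for a vLLM OpenAI-compatible server (assumed defaults)."""
    # vLLM does not check the API key unless one is configured on the server,
    # but the client still requires a non-empty value.
    return {"base_url": base_url, "api_key": "EMPTY", "model": model}


def build_generator(embeddings, **overrides):
    """Build a ragas TestsetGenerator whose LLM calls go to the vLLM server."""
    # Imported lazily so this module loads even without the optional packages.
    from langchain_openai import ChatOpenAI
    from ragas.testset.generator import TestsetGenerator

    llm = ChatOpenAI(**vllm_chat_kwargs(**overrides))
    return TestsetGenerator.from_langchain(
        generator_llm=llm,
        critic_llm=llm,
        embeddings=embeddings,
    )
```

With this setup the generation model runs in a separate server process, so `generator = build_generator(embeddings)` can replace the `HuggingFacePipeline`-based generator while the rest of the script stays the same. The embedding model is still loaded locally in this sketch.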