neo4j-labs / llm-graph-builder

Neo4j graph construction from unstructured data using LLMs
https://neo4j.com/labs/genai-ecosystem/llm-graph-builder/
Apache License 2.0
2.16k stars 329 forks

Regarding the issue of EMBEDDING_MODEL having dimensions of 768 and 1024 #609

Closed gy850222 closed 2 months ago

gy850222 commented 2 months ago

I tried setting EMBEDDING_MODEL to other models and found that models with dimensions of 768 and 1024 both had issues. For example:

model_name="intfloat/multilingual-e5-base", cache_folder="/embedding_model", dimension = 768
model_name="intfloat/multilingual-e5-large-instruct", cache_folder="/embedding_model", dimension = 1024

.env:

EMBEDDING_MODEL = "distiluse-base-multilingual-cased-v2"

EMBEDDING_MODEL = "intfloat/multilingual-e5-small"

EMBEDDING_MODEL = "intfloat/multilingual-e5-large-instruct"

EMBEDDING_MODEL = "intfloat/multilingual-e5-base"

common_fn.py:

    embeddings = SentenceTransformerEmbeddings(
        #model_name="all-MiniLM-L6-v2"#, cache_folder="/embedding_model"
        #model_name="paraphrase-multilingual-MiniLM-L12-v2", cache_folder="/embedding_model"
        #model_name="distiluse-base-multilingual-cased-v2", cache_folder="/embedding_model"
        #model_name="intfloat/multilingual-e5-small", cache_folder="/embedding_model"
        #model_name="intfloat/multilingual-e5-large-instruct", cache_folder="/embedding_model"
        model_name="intfloat/multilingual-e5-base", cache_folder="/embedding_model"
    )
    #dimension = 384
    dimension = 768
    #dimension = 512
    #dimension = 1024
    logging.info(f"Embedding: Using SentenceTransformer , Dimension:{dimension}")

Currently, only models with a dimension of 384 work. For example: EMBEDDING_MODEL = "intfloat/multilingual-e5-small"

log:

Exception in post_processing tasks: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure db.index.vector.queryNodes: Caused by: java.lang.IllegalArgumentException: Index query vector has 768 dimensions, but indexed vectors have 384.}

@jexp
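The error in the log is a size mismatch between the query vector (768) and the vectors already indexed (384). One way to catch this earlier is to measure the dimension from the model instead of hard-coding it next to the model name. The following is a minimal sketch, not the project's code: `embedding_dimension` and `check_dimension` are hypothetical helper names, and the only assumption about the embeddings object is that it exposes a LangChain-style `embed_query(text)` method.

```python
from typing import Protocol, Sequence


class SupportsEmbedQuery(Protocol):
    """Any object with a LangChain-style embed_query method (assumption)."""

    def embed_query(self, text: str) -> Sequence[float]: ...


def embedding_dimension(embeddings: SupportsEmbedQuery,
                        probe: str = "dimension probe") -> int:
    """Return the length of the vector the model produces for a short probe text."""
    return len(embeddings.embed_query(probe))


def check_dimension(embeddings: SupportsEmbedQuery, index_dimension: int) -> None:
    """Raise a readable error if the model and the index disagree on vector size."""
    actual = embedding_dimension(embeddings)
    if actual != index_dimension:
        raise ValueError(
            f"Embedding model produces {actual} dimensions, "
            f"but the vector index was built with {index_dimension}."
        )
```

In a setup like the common_fn.py snippet above, `dimension = embedding_dimension(embeddings)` could stand in for the hand-edited `dimension = 768` line, at the cost of one extra embedding call at startup.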

praveshkumar1988 commented 2 months ago

@gy850222 You need to drop the existing vector index, which was created with dimension 384. Please run the Cypher command: DROP INDEX vector
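For anyone who prefers to do this from Python rather than from Neo4j Browser, here is a small sketch of the same drop. It assumes the index is named `vector`, as in the command above. `drop_index_statement` and `drop_vector_index` are hypothetical helpers, and `session` is assumed to be any object with a `run(query)` method, such as a session from the official `neo4j` Python driver.

```python
def drop_index_statement(index_name: str = "vector") -> str:
    """Build a Cypher statement that drops the named index if it exists."""
    # Index names cannot be passed as query parameters, so the name is
    # quoted with backticks; a backtick inside the name is escaped by doubling.
    escaped = index_name.replace("`", "``")
    return f"DROP INDEX `{escaped}` IF EXISTS"


def drop_vector_index(session, index_name: str = "vector") -> None:
    """Run the drop statement on a session-like object exposing run(query)."""
    session.run(drop_index_statement(index_name))


# Possible usage with the neo4j driver (uri, user and password are placeholders):
# from neo4j import GraphDatabase
# with GraphDatabase.driver(uri, auth=(user, password)) as driver:
#     with driver.session() as session:
#         drop_vector_index(session)
```

The `IF EXISTS` clause makes the call safe to repeat when the index has already been removed.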

gy850222 commented 2 months ago

@praveshkumar1988 Thank you for the reminder. It works now. Thank you. :)