stephanedebove opened this issue 10 months ago
Does fuzzy_citation need a specific tokenizer, text splitter, or LLM in order to work properly? I'm using
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en", max_length=512)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model, chunk_size=512, chunk_overlap=20)
and zephyr-7b-beta as the LLM, but the extracted parts of the source node used for the response are always a bit off, and for some prompts I get an IndexError: list index out of range.
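For context on why extracted spans can be "a bit off": fuzzy citation works by approximately matching the response text against the source node's text, so any difference in casing, whitespace, or tokenization between the two can shift the matched span. The sketch below is illustrative only (it is not LlamaIndex's actual implementation) and uses stdlib `difflib` to show the effect:

```python
# Illustrative sketch, NOT LlamaIndex's fuzzy_citation implementation:
# approximate matching of a response sentence against source text can
# return a span whose boundaries are slightly off when the texts differ
# in small ways (case, punctuation, tokenization).
from difflib import SequenceMatcher

def fuzzy_find(response_sentence: str, source_text: str) -> tuple[int, int]:
    """Return (start, end) of the longest matching span in source_text."""
    matcher = SequenceMatcher(None, source_text, response_sentence, autojunk=False)
    match = matcher.find_longest_match(0, len(source_text), 0, len(response_sentence))
    return match.a, match.a + match.size

source = "The quick brown fox jumps over the lazy dog."
answer = "the quick brown fox leaps over a lazy dog"
start, end = fuzzy_find(answer, source)
# The case mismatch on "The"/"the" makes the matched span start mid-word,
# at "he quick brown fox " rather than at the sentence start.
print(source[start:end])
```

If the real postprocessor does something similar, a chunk size or splitter that strips or normalizes text differently from the LLM's output could plausibly produce the slightly-off citations you're seeing.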
I think @logan-markewich can answer this better 🦇🔦