jina-ai/late-chunking
Code for explaining and evaluating late chunking (chunked pooling)
Apache License 2.0
feat: add semantic chunking to eval script; add wrapper for minilm #11
Closed by guenthermi 1 month ago
guenthermi commented 1 month ago
Adds embedding model arguments to the eval script so that semantic chunking can be run
Adds a SentenceTransformers wrapper for running MiniLM
The current semantic chunking implementation drops punctuation marks from the chunks; this PR changes it to keep them
Makes the semantic chunking test more comprehensive
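
To illustrate the punctuation change, here is a minimal sketch of semantic chunking that keeps sentence-ending punctuation inside each chunk. It is not the code from this PR: the names `split_sentences`, `bow_embed`, `cosine` and `semantic_chunks` are hypothetical, the similarity threshold is an arbitrary assumption, and the bag-of-words "embedding" is only a stand-in for a real model such as MiniLM.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Mapping

# All names below are hypothetical; none are taken from the late-chunking repo.


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' and keep the punctuation with its sentence."""
    # The lookbehind leaves the punctuation mark attached to the sentence;
    # only the whitespace that follows it is consumed by the split.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bow_embed(sentence: str) -> Mapping[str, int]:
    """Toy bag-of-words vector, standing in for a real embedding model."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a: Mapping[str, int], b: Mapping[str, int]) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b.get(token, 0) for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(
    text: str,
    embed: Callable[[str], Mapping[str, int]] = bow_embed,
    threshold: float = 0.3,  # assumed value, chosen only for this example
) -> List[str]:
    """Merge adjacent sentences whose similarity reaches the threshold."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    groups = [[sentences[0]]]
    previous = embed(sentences[0])
    for sentence in sentences[1:]:
        current = embed(sentence)
        if cosine(previous, current) >= threshold:
            groups[-1].append(sentence)  # similar enough: same chunk
        else:
            groups.append([sentence])  # topic shift: start a new chunk
        previous = current
    return [" ".join(group) for group in groups]


if __name__ == "__main__":
    sample = (
        "Berlin is the capital of Germany. "
        "Berlin is a large city. "
        "Cats sleep often!"
    )
    for chunk in semantic_chunks(sample):
        print(repr(chunk))
```

Because the splitter consumes only whitespace, joining the chunks with single spaces reproduces a single-spaced input exactly, which gives a simple property to check in a chunking test.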