This pull request adds the Model2VecEmbeddings class to the chonkie library as a new embedding model, updates the documentation to reflect the new installation options, and adds tests for the new embedding model.
New Embedding Model:
Added Model2VecEmbeddings class to support the model2vec library, providing a new, efficient embedding model option. (src/chonkie/embeddings/model2vec.py, src/chonkie/embeddings/__init__.py, src/chonkie/__init__.py, src/chonkie/embeddings/registry.py) [1][2][3][4][5][6]
Documentation Updates:
Updated DOCS.md to include information about the new Model2VecEmbeddings and revised installation instructions for various embedding providers. [1][2][3][4][5]
Dependency and Version Updates:
Updated pyproject.toml to include model2vec as an optional dependency and incremented the package version to 0.2.1. [1][2]
Test Additions:
Added tests for Model2VecEmbeddings to ensure proper functionality and integration. (tests/embeddings/test_model2vec_embeddings.py)
These changes enhance the functionality of the chonkie library by adding support for a new, efficient embedding model and ensuring that the documentation and tests are updated accordingly.
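For readers unfamiliar with the pattern, the following is a minimal sketch of how an embeddings wrapper around an optional dependency could be structured and registered by name. It is not the code added in this PR: the names `SketchModel2VecEmbeddings`, `SKETCH_REGISTRY`, and `get_embeddings_class` are hypothetical, and the default model name `minishlab/potion-base-8M` is an assumption chosen for illustration. The only model2vec calls used are `StaticModel.from_pretrained` and `StaticModel.encode`.

```python
import importlib.util
from typing import Dict, List, Type


class SketchModel2VecEmbeddings:
    """Hypothetical wrapper around model2vec (illustration only, not chonkie's class)."""

    def __init__(self, model_name: str = "minishlab/potion-base-8M") -> None:
        # The default model name is an assumption made for this sketch.
        if not self.is_available():
            raise ImportError(
                "model2vec is not installed; install it to use this wrapper."
            )
        # Import lazily so this module still loads without the optional dependency.
        from model2vec import StaticModel

        self.model_name = model_name
        self.model = StaticModel.from_pretrained(model_name)

    @staticmethod
    def is_available() -> bool:
        """Report whether the optional model2vec package can be imported."""
        return importlib.util.find_spec("model2vec") is not None

    def embed(self, text: str):
        """Embed one string and return its vector."""
        return self.model.encode([text])[0]

    def embed_batch(self, texts: List[str]):
        """Embed several strings and return one vector per input."""
        return self.model.encode(texts)


# Hypothetical name-to-class registry, standing in for a real provider registry.
SKETCH_REGISTRY: Dict[str, Type] = {"model2vec": SketchModel2VecEmbeddings}


def get_embeddings_class(name: str) -> Type:
    """Look up a registered embeddings class by its short name."""
    try:
        return SKETCH_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown embeddings provider: {name!r}") from None
```

With model2vec installed, `get_embeddings_class("model2vec")()` would construct the wrapper and `embed("some text")` would return a single vector. Without it, the constructor raises `ImportError`, which keeps the dependency optional.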