-
**Is your feature request related to a problem? Please describe.**
Sometimes we have a set of uncategorized texts. Either we do not care about categorization models, or we want to create a model to h…
-
# Support Batching in Map Operations
## Background
Currently, map operations execute an LLM call per input document. For very small documents, it's plausible we can execute on multiple docs togeth…
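One way to picture the batching idea is a greedy grouping step that runs before any LLM call is made. The sketch below is only an illustration: `batch_documents`, `max_chars`, and `max_docs` are hypothetical names, not part of any existing API, and character count stands in for a real token budget.

```python
from typing import List


def batch_documents(docs: List[str], max_chars: int = 2000, max_docs: int = 8) -> List[List[str]]:
    """Greedily group documents so each batch fits a size budget.

    Hypothetical helper: `max_chars` approximates a prompt budget and
    `max_docs` caps how many documents share one call.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_chars = 0
    for doc in docs:
        # Close the current batch if adding this doc would exceed either limit.
        would_overflow = current_chars + len(doc) > max_chars or len(current) >= max_docs
        if current and would_overflow:
            batches.append(current)
            current, current_chars = [], 0
        current.append(doc)
        current_chars += len(doc)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    docs = ["a" * 10, "b" * 10, "c" * 25, "d" * 5]
    for batch in batch_documents(docs, max_chars=20):
        print([len(d) for d in batch])
```

A document larger than `max_chars` ends up alone in its own batch, so large inputs still get one call each, as they do today.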
-
When using triplet loss, we try to minimize the distance between each pair `(a_1, p_1)` while maximizing the distance between `(a_1, p_j)` for `j != 1`.
I'm trying to solve the following; for given se…
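For reference, the objective described above can be written as a minimal pure-Python sketch. It assumes Euclidean distance, a hinge with a `margin`, and that every other positive in the batch serves as a negative for a given anchor; the function names are illustrative and not taken from any library.

```python
import math
from typing import List, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def in_batch_triplet_loss(
    anchors: List[Sequence[float]],
    positives: List[Sequence[float]],
    margin: float = 1.0,
) -> float:
    """Mean hinge loss over all triplets (a_i, p_i, p_j) with j != i.

    Assumption: positives of other anchors act as negatives.
    """
    if len(anchors) != len(positives):
        raise ValueError("anchors and positives must have the same length")
    losses = []
    for i, anchor in enumerate(anchors):
        d_pos = euclidean(anchor, positives[i])
        for j, other in enumerate(positives):
            if j == i:
                continue
            d_neg = euclidean(anchor, other)
            # Penalize when the negative is not at least `margin` farther than the positive.
            losses.append(max(0.0, d_pos - d_neg + margin))
    return sum(losses) / len(losses) if losses else 0.0
```

The loss is zero once every `(a_i, p_j)` with `j != i` is at least `margin` farther away than `(a_i, p_i)`.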
-
Given the volume of data indexed, there is a need to have a way to search for the right tool. This includes creating themes for the repositories, as well as more metadata for each repository at a late…
-
### Description
When I press the cluster button, my screen freezes for a few seconds (expected) but there is no chat log message regarding clustering starting. And when it unfreezes, no text chat log…
-
# Clustering Documents with OpenAI, LangChain, and HDBSCAN
This article will teach you how to cluster text data with LLMs using cutting-edge tools.
[https://dylancastillo.co/clustering-documents-wit…
-
jina-embeddings-v3 is a multilingual multi-task text embedding model designed for a variety of NLP applications. Based on the [Jina-XLM-RoBERTa architecture](https://huggingface.co/jinaai/xlm-roberta-…
-
I am requesting to add support for [IPinfo's IP to Country database](https://ipinfo.io/products/free-ip-database) to the project. The database has the following features:
- It includes country and …
-
Hello,
I would like to express my sincere appreciation for your passionate communication and efficient package management. I have reviewed the documentation and code related to the use of UMAP, but…