hwchase17 / langchain-hub


Customized Embedding Hub - Examples, Datasets, Pre-Trained Matrices #18

Open Glavin001 opened 1 year ago

Glavin001 commented 1 year ago

Problem

The default embeddings (e.g. Ada-002 from OpenAI) are great generalists. However, they are not tailored to your specific use case.

Proposed Solution

🎉 Customizing Embeddings!

ℹ️ See my tutorial / lessons learned if you're interested in learning more, step-by-step, with screenshots and tips.

🎯 Specifically, Langchain Hub could provide a collection of pre-trained custom embeddings.

Similar to https://huggingface.co/models except focused on semantic embeddings. List the known tasks so developers can search the available custom embeddings for each:

The Hub provides a set of Tasks, each with:

Use Langchain's helpers to train and apply the custom embedding matrix:
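
To make the "use" half concrete, here is a minimal sketch in plain NumPy. It does not use any Langchain helper, and the function names `apply_custom_matrix` and `cosine_similarity` are hypothetical. It assumes the custom embedding is a matrix that was learned elsewhere and that is multiplied onto each base embedding before similarities are compared. The toy vectors stand in for real model output.

```python
import numpy as np


def apply_custom_matrix(embedding, matrix):
    """Project a base embedding through a custom matrix, then L2-normalize.

    Assumption: `matrix` has shape (base_dim, custom_dim) and was trained
    elsewhere for the task at hand.
    """
    projected = np.asarray(embedding, dtype=float) @ np.asarray(matrix, dtype=float)
    norm = np.linalg.norm(projected)
    return projected / norm if norm > 0 else projected


def cosine_similarity(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Toy stand-ins for base embeddings (real ones would come from a model).
rng = np.random.default_rng(0)
base_a = rng.normal(size=8)
base_b = rng.normal(size=8)

# Placeholder for a trained matrix; random here, so the result is not meaningful.
custom_matrix = rng.normal(size=(8, 4))

custom_a = apply_custom_matrix(base_a, custom_matrix)
custom_b = apply_custom_matrix(base_b, custom_matrix)

print("base similarity:  ", cosine_similarity(base_a, base_b))
print("custom similarity:", cosine_similarity(custom_a, custom_b))
```

With an identity matrix the customized similarity equals the base similarity, which makes a handy sanity check before swapping in a trained matrix downloaded from a hub.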