JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0

SparkNLP 1005 implement nomic embeddings #14217

Open prabod opened 4 months ago

prabod commented 4 months ago

This PR introduces nomic embeddings to Spark NLP.

Description

nomic-embed-text-v1 is a text encoder with an 8192-token context length that surpasses OpenAI's text-embedding-ada-002 and text-embedding-3-small on short- and long-context tasks.

jtattersall09403 commented 3 days ago

Hi, has there been any further progress on this PR? Are there any estimated timescales for when it might be completed? Thanks!