JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.76k stars 703 forks

Can not find the model to download bge-m3 #14243

Closed avivshafir closed 2 months ago

avivshafir commented 2 months ago

Is there an existing issue for this?

Who can help?

No response

What are you working on?

Sentence embedding in pyspark

Current Behavior

The model is not being found

Expected Behavior

Model should be found

Steps To Reproduce

document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")

sentencerDL = (
    SentenceDetectorDLModel.pretrained("sentence_detector_dl", "xx")
    .setInputCols(["document"])
    .setOutputCol("sentence")
)

embeddings = (
    XlmRoBertaSentenceEmbeddings.pretrained("bge_m3 ","xx")
    .setInputCols(["sentence"])
    .setOutputCol("embeddings")
)

pipeline = Pipeline().setStages([document_assembler, sentencerDL, embeddings])

# One single-column row per name, so the single column name "text" matches the data
df = spark.createDataFrame([["John Doe"], ["Jane Doe"]]).toDF("text")

Spark NLP version and Apache Spark

sparknlp=='5.3.3' jar version=5.3.3 spark_version='3.5.0'

databricks runtime version: 14.3.x-gpu-ml-scala2.12

Type of Spark Application

spark-submit

Java Version

No response

Java Home Directory

No response

Setup and installation

No response

Operating System and Version

No response

Link to your project (if available)

No response

Additional Information

No response

maziyarpanahi commented 2 months ago

This is our fault: the code snippet has an extra space in the model name ("bge_m3 " should be "bge_m3"). Thanks for reporting it; I'll fix the model card.
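
For anyone who lands here with the same error, a minimal sketch of guarding against this kind of typo might look like the following. The helper `clean_model_name` is hypothetical (it is not part of Spark NLP); it only strips surrounding whitespace from a name copied out of a model card and rejects names that are empty or contain whitespace in the middle. The Spark NLP call is left as a comment so the block runs without a Spark session.

```python
def clean_model_name(name: str) -> str:
    """Hypothetical helper: strip surrounding whitespace from a model name.

    Raises ValueError if the name is empty after stripping or still
    contains whitespace in the middle, since either is likely a copy/paste slip.
    """
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("model name is empty")
    if any(ch.isspace() for ch in cleaned):
        raise ValueError(f"model name contains internal whitespace: {name!r}")
    return cleaned


# The name exactly as it appears in the snippet above, trailing space included.
model_name = clean_model_name("bge_m3 ")
print(repr(model_name))

# With Spark NLP installed, the cleaned name would then be used in place of
# the literal string, for example:
# embeddings = (
#     XlmRoBertaSentenceEmbeddings.pretrained(model_name, "xx")
#     .setInputCols(["sentence"])
#     .setOutputCol("embeddings")
# )
```

In the reported pipeline the simplest fix is just to write "bge_m3" without the trailing space; the helper is only useful when model names come from configuration or copied text.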