JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.77k stars, 705 forks

Onnx models fail when saving transformer #14194

Closed. mehmetbutgul closed this issue 3 months ago

mehmetbutgul commented 4 months ago

Is there an existing issue for this?

Who can help?

@maziyarpanahi

What are you working on?

I am working both in Colab and locally. I want to save a transformer that was loaded from a pretrained ONNX model, but I get an error when saving. When I investigated the error, I noticed that the main problem occurs while serializing the ONNX model --> https://github.com/JohnSnowLabs/spark-nlp/blob/43aa4b0e1c4f66cf7918f8ef5edce19047027542/src/main/scala/com/johnsnowlabs/ml/onnx/OnnxSerializeModel.scala#L31

Current Behavior

An error is raised while saving the transformer. This error applies to most ONNX-based annotators.

Expected Behavior

The transformer stage is written to disk without error.

Steps To Reproduce

DeBertaEmbeddings.pretrained("deberta_embeddings_erlangshen_v2_chinese_sentencepiece","zh") \
     .write().overwrite().save("models/deberta_model")

Colab notebook: https://colab.research.google.com/drive/119u6hXoT1PRB9F38InuEV-bm4g1uu9UH?usp=sharing
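For anyone reproducing this, here is a minimal sketch of a wrapper that attempts the save and captures the full traceback so it can be pasted into the report. `try_save` and `FakeStage` are hypothetical names, not part of Spark NLP; the wrapper only assumes that the stage exposes the `write().overwrite().save(path)` chain used in the snippet above.

```python
import traceback


def try_save(stage, path):
    """Attempt stage.write().overwrite().save(path).

    Returns (True, None) on success, or (False, traceback_text) on failure.
    `stage` is any object exposing the writer chain used in the steps above.
    """
    try:
        stage.write().overwrite().save(path)
        return True, None
    except Exception:
        # Keep the whole traceback text so it can be attached to the issue.
        return False, traceback.format_exc()


class FakeStage:
    """Hypothetical stand-in for an annotator, used only to exercise try_save."""

    def __init__(self, fail=False):
        self.fail = fail
        self.saved_to = None

    def write(self):
        return self

    def overwrite(self):
        return self

    def save(self, path):
        if self.fail:
            raise RuntimeError("simulated serialization failure")
        self.saved_to = path
```

With a real session, the same helper could be called as `ok, err = try_save(DeBertaEmbeddings.pretrained("deberta_embeddings_erlangshen_v2_chinese_sentencepiece", "zh"), "models/deberta_model")`, and `err` would then hold the stack trace to include here.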

Spark NLP version and Apache Spark

spark-nlp==5.3.1 pyspark
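The exact pyspark version is not pinned above. A small sketch like the following, using only the Python standard library, could print the installed versions for the report. `package_version` and `environment_report` are hypothetical helper names, and the package names `spark-nlp` and `pyspark` are assumed to match the installed distribution names.

```python
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of `name`, or 'not installed'."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=("spark-nlp", "pyspark")):
    """Build a short, pasteable summary of the Python environment."""
    lines = [
        f"python=={sys.version.split()[0]}",
        f"os=={platform.platform()}",
    ]
    lines += [f"{p}=={package_version(p)}" for p in packages]
    return "\n".join(lines)


print(environment_report())
```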

Type of Spark Application

No response

Java Version

No response

Java Home Directory

No response

Setup and installation

No response

Operating System and Version

No response

Link to your project (if available)

No response

Additional Information

No response