JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0
3.83k stars, 710 forks

Scala 2.13 support #14203

Open thirstler opened 6 months ago

thirstler commented 6 months ago

Description

Scala 2.12 seems to be the de facto standard across many Spark packages, but I'm using packages that specifically require Spark built with Scala 2.13, which prevents me from using Spark NLP. (My understanding is also that the Spark project will be deprecating Scala 2.12 in 3.6.) Is a Spark NLP build using Scala 2.13 possible?

Preferred Solution

A Spark package for Scala 2.13, e.g. conf.set("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.13:5.3.1")
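
For illustration only, here is a minimal Python sketch of the naming convention behind that request: Scala libraries carry the Scala binary version as an artifact suffix (`_2.12`, `_2.13`), and every entry in `spark.jars.packages` should agree with the Scala version Spark itself was built with. The helper names `scala_coordinate` and `scala_suffixes` are hypothetical, the `spark-avro` coordinate is just an example of a second package, and `spark-nlp_2.13` is the artifact being requested here, not one assumed to be published.

```python
import re

# Scala binary-version suffix at the end of an artifact name,
# e.g. "spark-nlp_2.12" -> "2.12" (Scala 3 artifacts use "_3").
_SCALA_SUFFIX = re.compile(r"_(\d+\.\d+|3)$")


def scala_coordinate(group, artifact, scala_binary, version):
    """Build a Maven coordinate with a Scala binary-version suffix (hypothetical helper)."""
    return f"{group}:{artifact}_{scala_binary}:{version}"


def scala_suffixes(packages):
    """Return the Scala binary versions found in a comma-separated
    spark.jars.packages value (hypothetical helper)."""
    found = set()
    for coord in packages.split(","):
        parts = coord.strip().split(":")
        if len(parts) < 3:
            continue  # not a group:artifact:version coordinate
        match = _SCALA_SUFFIX.search(parts[1])
        if match:
            found.add(match.group(1))
    return found


# The coordinate requested in this issue (assumed, not a published artifact).
requested = scala_coordinate("com.johnsnowlabs.nlp", "spark-nlp", "2.13", "5.3.1")

# Mixing it with a Scala 2.12 package yields two different suffixes,
# which is the kind of mismatch described above.
mixed = requested + ",org.apache.spark:spark-avro_2.12:3.5.0"
if len(scala_suffixes(mixed)) > 1:
    print("Mixed Scala binary versions:", sorted(scala_suffixes(mixed)))
```

A check like this only inspects coordinate strings; it does not verify that an artifact exists in any repository.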

Additional Context

None

maziyarpanahi commented 6 months ago

Unfortunately, we are going to wait until Apache Spark introduces support for Scala 3.x. Moving Spark NLP to a new major Scala version means redoing all of our saved pipelines and 99% of our models. (They contain Java objects; if they were saved under one major version, they cannot be used under another.)

We tried to find solutions for this, but like many other libraries built natively on top of Apache Spark, we also suffer from saved models not being compatible with newer versions of Scala.

That said, if we have to move to a newer version of Scala, given that 3.0 has been out for a while, we would rather wait until we can do this once, for Scala 3.x support. (I believe before deprecating 2.12, they will introduce 3.x support)

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 5 days

SemyonSinchenko commented 3 weeks ago

> I believe before deprecating 2.12, they will introduce 3.x support

As far as I understand, Spark 4.0 works only with Scala 2.13: https://issues.apache.org/jira/browse/SPARK-45314

maziyarpanahi commented 3 weeks ago

> > I believe before deprecating 2.12, they will introduce 3.x support
>
> As far as I understand, Spark 4.0 works only with Scala 2.13: https://issues.apache.org/jira/browse/SPARK-45314

Yes, and we are actually waiting for that release. Once it's out: