Closed GonzaloRuizGit closed 2 years ago
Hi,
Environment section and fill it in?
Hi maziyarpanahi, thanks for your quick reply. I have edited the first comment. What do you mean by environment?
Thank you, the following information is very important. (Your first error seems to be that you used a spark-nlp package built for Spark 3.0/3.1 on Spark 3.2; the second seems to be a mismatch between your PyPI spark-nlp package and the Maven one. So knowing the following, and everything else, would be very helpful.)
sparknlp.version():
spark.version:
java -version:
Since I am using AWS, I don't know the right way to find the Java version or the setup version. So if you see something strange in the versions I gave you, I can double-check that.
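For anyone unsure how to gather these three values on a managed platform such as AWS, here is a minimal sketch that can be run on the driver or in a notebook. The helper name `collect_environment` is hypothetical, and it reads `pyspark.__version__` (the installed PySpark package) as a stand-in for `spark.version`; on a live session, `spark.version` itself is the authoritative value.

```python
import subprocess


def collect_environment():
    """Gather the three requested version strings; anything unavailable stays None."""
    info = {"spark_nlp": None, "spark": None, "java": None}

    try:
        import sparknlp  # installed from PyPI as `spark-nlp`
        info["spark_nlp"] = sparknlp.version()
    except ImportError:
        pass

    try:
        import pyspark
        # Installed PySpark version; compare with spark.version on a running session.
        info["spark"] = pyspark.__version__
    except ImportError:
        pass

    try:
        # `java -version` writes its banner to stderr rather than stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
        banner = (result.stderr or result.stdout).strip()
        info["java"] = banner.splitlines()[0] if banner else None
    except OSError:
        # Java is not on the PATH of this process.
        pass

    return info


if __name__ == "__main__":
    for name, value in collect_environment().items():
        print(f"{name}: {value}")
```

If `java` comes back as `None`, the JVM is probably not on the PATH of the Python process, which can happen on managed clusters; in that case the platform's runtime documentation is the better source.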
Very nice! It's enough to resolve the first issue for sure. For Apache Spark 3.2.x you need a different Spark NLP package from Maven:
com.johnsnowlabs.nlp:spark-nlp-spark32_2.12:3.4.1
com.johnsnowlabs.nlp:spark-nlp-gpu-spark32_2.12:3.4.1
Reference:
For the second issue, please make sure Maven and PyPI are on the same version. (You mentioned 3.4.1, but your code says 3.1.2; let's make sure they are both 3.4.1.)
pip install spark-nlp==3.4.1
com.johnsnowlabs.nlp:spark-nlp-spark32_2.12:3.4.1
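The two checks above (the right artifact for the Spark version, and the same release on PyPI and Maven) can be sketched as a small helper. The function names are hypothetical, and the mapping is an assumption limited to the 3.4.x release line: the `-spark32` suffix for Spark 3.2.x comes from this thread, while the unsuffixed artifact for Spark 3.0/3.1 should be confirmed against the Spark NLP release notes.

```python
def spark_nlp_maven_coordinate(spark_version, spark_nlp_version, gpu=False):
    """Build the Maven coordinate for a given Spark and Spark NLP version.

    Assumption: artifact naming as used by the Spark NLP 3.4.x releases.
    """
    major_minor = ".".join(spark_version.split(".")[:2])
    suffix_by_spark = {"3.0": "", "3.1": "", "3.2": "-spark32"}
    if major_minor not in suffix_by_spark:
        raise ValueError(f"No known artifact for Spark {spark_version}")

    artifact = "spark-nlp"
    if gpu:
        artifact += "-gpu"
    artifact += suffix_by_spark[major_minor] + "_2.12"
    return f"com.johnsnowlabs.nlp:{artifact}:{spark_nlp_version}"


def versions_match(pypi_version, maven_coordinate):
    """True when the PyPI release equals the version at the end of the coordinate."""
    return maven_coordinate.rsplit(":", 1)[1] == pypi_version


if __name__ == "__main__":
    coordinate = spark_nlp_maven_coordinate("3.2.1", "3.4.1")
    print(coordinate)
    print(versions_match("3.4.1", coordinate))
    print(versions_match("3.1.2", coordinate))
```

The resulting string is what would go into `spark.jars.packages` (or `--packages`), alongside `pip install spark-nlp==3.4.1` on the Python side.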
Thank you very much. As you said, the problem was the spark-nlp version.
I'm trying to run a simple example using a pre-trained pipeline from the Spark NLP library. I get an error when I'm downloading the pipeline:
code
error
Then I tried downloading the model and loading it from disk using sparknlp.start(), which turned out to work properly:
code
but the problem with this method is that I am unable to read a .parquet file from AWS S3.
The final model I tried also has an error:
code
error