nlu support on Python 3.8

javierabosch2 commented 4 years ago

On import nlu, it looks like pyspark/cloudpickle.py is failing with:

TypeError: an integer is required (got type bytes)

From some research, I found this is an issue with running pyspark on Python 3.8. I am not sure if this is the only cause, but if it is, I recommend adding a requirement for Python<3.8.

alexott commented 4 years ago

It’s coming from pyspark 2.4, which does not run on Python 3.8... but yes, it makes sense to put a limit on the Python version.

C-K-Loan commented 4 years ago

Thanks for sharing this issue.

We are looking to fix various versioning issues for the next release.

phoebusg commented 4 years ago

Maybe a similar issue on 3.8.0:

AppData\Local\Programs\Python\Python38\lib\site-packages\nlu\components\chunkers\ngram\ngram.py", line 11
    .setEnableCumulative(False) \
    ^
SyntaxError: unexpected EOF while parsing

Contents of ngram.py:

import nlu.pipe_components
from sparknlp.annotator import *

class NGram:
    @staticmethod
    def get_default_model():
        return NGramGenerator() \
            .setInputCols(["token"]) \
            .setOutputCol("ngrams") \
            .setN(2) \
            .setEnableCumulative(False) \
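The file above ends with a line-continuation backslash and nothing after it, which is what the parser complains about. A minimal sketch of that failure mode follows; the helper name parses and the two sample strings are made up for illustration, and the parenthesized variant is just one common way to avoid trailing backslashes, not necessarily how nlu fixed it.

```python
def parses(source: str) -> bool:
    """Return True if the source compiles, False on a SyntaxError."""
    try:
        # compile() only checks syntax, so NGramGenerator need not be importable.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Mirrors the shape of the reported file: the last line ends with a backslash.
BROKEN = (
    "model = NGramGenerator() \\\n"
    "    .setN(2) \\\n"
    "    .setEnableCumulative(False) \\\n"
)

# Same call chain wrapped in parentheses, so no backslashes are needed.
FIXED = (
    "model = (\n"
    "    NGramGenerator()\n"
    "    .setN(2)\n"
    "    .setEnableCumulative(False)\n"
    ")\n"
)

print(parses(BROKEN), parses(FIXED))  # prints: False True
```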

C-K-Loan commented 4 years ago

Hi @phoebusg, thanks for sharing.

This is an issue with nlu version 1.0.1. It is fixed in nlu 1.0.2, which will be released today.

phoebusg commented 4 years ago

Thank you C-K, I appreciate the update... looking forward to the release! :)

phoebusg commented 4 years ago

python .\demo.py
Please use a Python version with version number SMALLER than 3.8
Python versions equal or higher 3.8 is currently NOT SUPPORTED by NLU

:( Too lazy to set up an environment with <3.8. I'll wait for the next update then.

alexott commented 4 years ago

@phoebusg NLU depends on Spark 2.4, which doesn't work with Python 3.8...

muhlbach commented 3 years ago

Do you guys have any idea when Python >=3.8 will be supported by NLU? I think Spark >=3.0 should support Python 3.8. I'm trying to figure out whether I should set up another environment with an earlier Python version.

C-K-Loan commented 3 years ago

Hi @muhlbach, in the upcoming NLU healthcare release we will support Python 3.8 and Spark 3.0. The first release candidate will be available in a few days for testing, stay tuned :)

muhlbach commented 3 years ago

@C-K-Loan thanks for your answer! That’s great. Looking forward to being able to use NLU again on Python 3.8.

C-K-Loan commented 3 years ago

@muhlbach You can now install and test the Spark 3 and Python 3.8 compatible NLU release candidate like this:

! pip install pyspark==3.0.1 spark-nlp==3.0.1 nlu==3.0.0rc2

Stay tuned for the full release this week :)

muhlbach commented 3 years ago

@C-K-Loan, thanks! I tried installing the pre-release and it installed perfectly. However, trying:

import nlu
pipe = nlu.load('embed_sentence.bert')

leads to the following error:

Traceback (most recent call last):
  File "<ipython-input-3-496e026ee915>", line 1, in <module>
    pipe = nlu.load('embed_sentence.bert')
  File "/Users/muhlbach/opt/anaconda3/envs/stable/lib/python3.8/site-packages/nlu/__init__.py", line 261, in load
    pipe.add(nlu_component, nlu_ref)
  File "/Users/muhlbach/opt/anaconda3/envs/stable/lib/python3.8/site-packages/nlu/pipe/pipeline.py", line 94, in add
    if hasattr(component.info,'nlu_ref'): nlu_reference = component.info.nlu_ref
AttributeError: type object 'NluError' has no attribute 'info'

I'm wondering whether this has something to do with my Java installation?

Here's the version:

java -version
openjdk version "13.0.6" 2021-01-19
OpenJDK Runtime Environment Zulu13.37+21-CA (build 13.0.6+5-MTS)
OpenJDK 64-Bit Server VM Zulu13.37+21-CA (build 13.0.6+5-MTS, mixed mode)

Btw, I'm running on an Apple M1 chip.

C-K-Loan commented 3 years ago

Hi @muhlbach

Yes, you are right, Java 13 is the main problem. You need to make sure Java 8 is the default Java.

Release candidate 3 is out:

pip install pyspark==3.0.1 spark-nlp==3.0.1 nlu==3.0.0rc3

You can test with this Colab: https://colab.research.google.com/drive/1N8AmbaqgAjyXXgP26B2GWuM7V4fOemNW?usp=sharing
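To confirm which Java is the default before loading a pipeline, the text printed by java -version can be parsed. The following is a rough sketch, not part of NLU: the function name java_major_version is hypothetical, and the regular expression assumes the usual version "..." line that both the legacy scheme (1.8.0_282 for Java 8) and the modern scheme (13.0.6 for Java 13) print.

```python
import re
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the Java major version from `java -version` text.

    Returns None when no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Legacy releases report "1.<major>", e.g. "1.8.0_282" for Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first


# First line of the output posted earlier in this thread.
sample = 'openjdk version "13.0.6" 2021-01-19'
print(java_major_version(sample))  # prints: 13
```

Note that java -version writes to stderr, so when capturing it with subprocess.run(["java", "-version"], capture_output=True, text=True), the text to pass in is the stderr attribute of the result.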

maziyarpanahi commented 3 years ago

Just in case you have an issue with Java on your M1 chip, you can follow this as well: https://github.com/JohnSnowLabs/spark-nlp/discussions/2282

C-K-Loan commented 3 years ago

NLU 3.0.0 is now released and supports Python 3.8 with Spark 3.1.X and Spark 3.0.X: https://github.com/JohnSnowLabs/nlu/releases

Be aware that if you run with a Spark version below 3, you cannot use Python 3.8, since that is only supported in Spark 3+.
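The pairing rule above can be written down as a small check. This is only a sketch of the rule as stated in this thread: is_supported is a hypothetical helper, not an NLU function, and it assumes that Python versions newer than 3.8 behave like 3.8 and that older Python versions work with either Spark line.

```python
from typing import Tuple


def is_supported(python_version: Tuple[int, int], spark_version: str) -> bool:
    """Apply the rule from this thread: Python 3.8 needs Spark 3 or newer."""
    spark_major = int(spark_version.split(".")[0])
    if python_version >= (3, 8):
        # Assumption: anything at or above 3.8 is treated like 3.8.
        return spark_major >= 3
    # Assumption: older Python versions work with both Spark 2.4 and Spark 3.
    return True


print(is_supported((3, 8), "2.4.7"))  # prints: False
print(is_supported((3, 8), "3.0.1"))  # prints: True
```

In a real environment the inputs could come from sys.version_info[:2] and pyspark.__version__.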

JohnSnowLabs / nlu

nlu support on Python 3.8 #11