JohnSnowLabs / nlu

1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
Apache License 2.0

nlu support on Python 3.8 #11

Closed javierabosch2 closed 3 years ago

javierabosch2 commented 4 years ago

On import nlu, it looks like pyspark/cloudpickle.py is failing with:

TypeError: an integer is required (got type bytes)

After some research, I found that this is a known issue when running pyspark on Python 3.8. I am not sure whether this is the only cause, but if it is, I recommend adding a requirement for Python<3.8.

alexott commented 4 years ago

It’s coming from pyspark 2.4, which does not run on Python 3.8... but yes, it makes sense to put a limit on the Python version.

C-K-Loan commented 4 years ago

Thanks for sharing this issue.

We are looking to fix various versioning issues for the next release

phoebusg commented 4 years ago

Maybe a similar issue on 3.8.0:

AppData\Local\Programs\Python\Python38\lib\site-packages\nlu\components\chunkers\ngram\ngram.py", line 11
    .setEnableCumulative(False) \
                                ^
SyntaxError: unexpected EOF while parsing

Contents of ngram.py:

import nlu.pipe_components
from sparknlp.annotator import *

class NGram:
    @staticmethod
    def get_default_model():
        return NGramGenerator() \
            .setInputCols(["token"]) \
            .setOutputCol("ngrams") \
            .setN(2) \
            .setEnableCumulative(False) \
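A quick way to see why this file fails to load: a backslash at the end of the last line tells the parser to expect a continuation, but the file ends instead. The sketch below reproduces this with plain strings, so it does not need sparknlp installed. The shortened method chain and the parenthesized variant are illustrative assumptions; the thread does not show how nlu 1.0.2 actually fixed the file.

```python
import ast

# Shortened stand-in for the ngram.py snippet quoted in the thread (assumption:
# only the trailing backslash matters, so the chain is abbreviated).
BROKEN = (
    "def get_default_model():\n"
    "    return NGramGenerator() \\\n"
    "        .setN(2) \\\n"
    "        .setEnableCumulative(False) \\\n"
)

# One possible repair (hypothetical, not necessarily the upstream fix):
# wrap the chain in parentheses so no line-continuation backslashes are needed.
FIXED = (
    "def get_default_model():\n"
    "    return (\n"
    "        NGramGenerator()\n"
    "        .setN(2)\n"
    "        .setEnableCumulative(False)\n"
    "    )\n"
)


def parses(source):
    """Return True if the source parses, False if it raises SyntaxError."""
    try:
        ast.parse(source)  # parsing only; names are never looked up
    except SyntaxError:
        return False
    return True


print("broken parses:", parses(BROKEN))
print("fixed parses:", parses(FIXED))
```

Simply deleting the final backslash would also make the file parse; the parenthesized form just avoids the same mistake when the chain is edited later.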

C-K-Loan commented 4 years ago

Hi @phoebusg , thanks for sharing,

This is an issue in NLU 1.0.1. It is fixed in NLU 1.0.2, which will be released today.

phoebusg commented 4 years ago

Thank you C-K, I appreciate the update... looking forward to the release! :)

phoebusg commented 4 years ago

python .\demo.py
Please use a Python version with version number SMALLER than 3.8
Python versions equal or higher 3.8 is currently NOT SUPPORTED by NLU

:( Too lazy to set up an environment with <3.8, so I'll wait for the next update then.

alexott commented 4 years ago

@phoebusg NLU depends on Spark 2.4, which doesn't work with Python 3.8...
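The refusal message quoted earlier in the thread looks like an import-time version guard. A minimal sketch of such a check follows. It is not NLU's actual code: the function name, the constant, and the error wording are hypothetical, and the 3.8 cutoff is taken from the thread.

```python
import sys

# Assumption from the thread: Python 3.8 is the first unsupported version
# while the library depends on Spark 2.4.
FIRST_UNSUPPORTED = (3, 8)


def check_python_version(version_info=None):
    """Return (major, minor) if supported, otherwise raise RuntimeError.

    version_info defaults to the running interpreter; passing a tuple
    makes the check easy to exercise without switching interpreters.
    """
    if version_info is None:
        version_info = sys.version_info
    current = tuple(version_info)[:2]
    if current >= FIRST_UNSUPPORTED:
        raise RuntimeError(
            "Python %d.%d is not supported; use a version below %d.%d"
            % (current + FIRST_UNSUPPORTED)
        )
    return current


if __name__ == "__main__":
    try:
        print("supported:", check_python_version())
    except RuntimeError as err:
        print(err)
```

Failing early with a clear message is easier to act on than the TypeError raised deep inside pyspark's cloudpickle.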

muhlbach commented 3 years ago

Do you guys have any idea when Python 3.8 and later will be supported by NLU? I think Spark 3.0 and later should support Python 3.8. I'm trying to figure out whether I should set up another environment with an earlier Python version.

C-K-Loan commented 3 years ago

Hi @muhlbach, in the upcoming NLU healthcare release we will support Python 3.8 and Spark 3.0. The first release candidate will be available for testing in a few days, stay tuned :)

muhlbach commented 3 years ago

@C-K-Loan thanks for your answer! That’s great. Looking forward to being able to use NLU again on Python 3.8.

C-K-Loan commented 3 years ago

@muhlbach You can now install and test the Spark 3 and Python 3.8 compatible NLU release candidate like this:

! pip install pyspark==3.0.1 spark-nlp==3.0.1 nlu==3.0.0rc2

Stay tuned for the full release this week :)

muhlbach commented 3 years ago

@C-K-Loan, thanks! I tried installing the pre-release and it installed perfectly. However, trying:

import nlu
pipe = nlu.load('embed_sentence.bert')

leads to the following error:

Traceback (most recent call last):
  File "<ipython-input-3-496e026ee915>", line 1, in <module>
    pipe = nlu.load('embed_sentence.bert')
  File "/Users/muhlbach/opt/anaconda3/envs/stable/lib/python3.8/site-packages/nlu/__init__.py", line 261, in load
    pipe.add(nlu_component, nlu_ref)
  File "/Users/muhlbach/opt/anaconda3/envs/stable/lib/python3.8/site-packages/nlu/pipe/pipeline.py", line 94, in add
    if hasattr(component.info,'nlu_ref'): nlu_reference = component.info.nlu_ref
AttributeError: type object 'NluError' has no attribute 'info'

I'm wondering whether this has something to do with my Java installation?

Here's the version:

java -version
openjdk version "13.0.6" 2021-01-19
OpenJDK Runtime Environment Zulu13.37+21-CA (build 13.0.6+5-MTS)
OpenJDK 64-Bit Server VM Zulu13.37+21-CA (build 13.0.6+5-MTS, mixed mode)

Btw. I'm running on an Apple M1 chip.

C-K-Loan commented 3 years ago

Hi @muhlbach

maziyarpanahi commented 3 years ago

Just in case your M1 chip has an issue with Java, you can follow this as well: https://github.com/JohnSnowLabs/spark-nlp/discussions/2282

C-K-Loan commented 3 years ago

NLU 3.0.0 is now released and supports Python 3.8 with Spark 3.1.X and Spark 3.0.X: https://github.com/JohnSnowLabs/nlu/releases

Be aware: if you run a Spark version below 3, you cannot use Python 3.8, since Python 3.8 is only supported in Spark 3+.
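The compatibility rule above can be written down as a small check. This is only a sketch: the function names are hypothetical, and it encodes just the single rule stated in this thread (Python 3.8 or newer needs Spark 3 or newer), not any other version constraint.

```python
def spark_major(spark_version):
    """Extract the major version from a string such as '3.0.1' or '2.4.7'."""
    return int(spark_version.split(".")[0])


def python38_rule_ok(python_version, spark_version):
    """Return False only for the combination the thread rules out.

    python_version: tuple such as (3, 8) or sys.version_info
    spark_version:  string such as pyspark.__version__
    """
    py = tuple(python_version)[:2]
    if py >= (3, 8):
        # Python 3.8+ is only supported with Spark 3+.
        return spark_major(spark_version) >= 3
    # Older Pythons are not covered by this rule, so it does not object.
    return True


if __name__ == "__main__":
    for py, spark in [((3, 8), "3.0.1"), ((3, 8), "2.4.7"), ((3, 7), "2.4.7")]:
        print(py, spark, python38_rule_ok(py, spark))
```

In a real environment the inputs would come from sys.version_info and pyspark.__version__.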