JohnSnowLabs / nlu

1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
Apache License 2.0
839 stars · 129 forks

Breaking dependencies #198

Closed CallMarl closed 9 months ago

CallMarl commented 10 months ago

Hello, I'm trying to run your library in WSL, but an error occurs with dependencies. The full trace is below:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/nlu/pipe/pipeline.py", line 468, in predict
    return __predict__(self, data, output_level, positions, keep_stranger_features, metadata, multithread,
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/nlu/pipe/utils/predict_helper.py", line 166, in __predict__
    pipe.fit()
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/nlu/pipe/pipeline.py", line 202, in fit
    self.vanilla_transformer_pipe = self.spark_estimator_pipe.fit(self.get_sample_spark_dataframe())
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/nlu/pipe/pipeline.py", line 101, in get_sample_spark_dataframe
    return sparknlp.start().createDataFrame(data=text_df)
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/pyspark/sql/session.py", line 673, in createDataFrame
    return super(SparkSession, self).createDataFrame(
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py", line 299, in createDataFrame
    data = self._convert_from_pandas(data, schema, timezone)
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/pyspark/sql/pandas/conversion.py", line 331, in _convert_from_pandas
    for column, series in pdf.iteritems():
  File "/home/callmarl/workzone/nlp/env/lib/python3.9/site-packages/pandas/core/generic.py", line 6202, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'iteritems'
callmarl@LAPTOP-QS9M6N2F ~/workzone/nlp % python --version
Python 3.9.2
callmarl@LAPTOP-QS9M6N2F ~/workzone/nlp % pip freeze
asttokens==2.4.0
backcall==0.2.0
certifi==2023.7.22
charset-normalizer==3.2.0
click==8.1.7
colorama==0.4.6
databricks-api==0.9.0
databricks-cli==0.17.7
dataclasses==0.6
decorator==5.1.1
exceptiongroup==1.1.3
executing==1.2.0
idna==3.4
ipython==8.15.0
jedi==0.19.0
johnsnowlabs==5.0.7
matplotlib-inline==0.1.6
nlu==5.0.0
numpy==1.25.2
oauthlib==3.2.2
pandas==2.1.0
parso==0.8.3
pexpect==4.8.0
pickleshare==0.7.5
pkg_resources==0.0.0
prompt-toolkit==3.0.39
ptyprocess==0.7.0
pure-eval==0.2.2
py4j==0.10.9
pyarrow==13.0.0
pydantic==1.10.11
Pygments==2.16.1
PyJWT==2.8.0
pyspark==3.1.2
python-dateutil==2.8.2
pytz==2023.3.post1
requests==2.31.0
six==1.16.0
spark-nlp==5.0.2
spark-nlp-display==4.1
stack-data==0.6.2
svgwrite==1.4
tabulate==0.9.0
traitlets==5.9.0
typing_extensions==4.7.1
tzdata==2023.3
urllib3==1.26.16
wcwidth==0.2.6
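The traceback shows pyspark's `_convert_from_pandas` iterating with `pdf.iteritems()`. That method was deprecated in pandas 1.5 and removed in pandas 2.0 in favor of `DataFrame.items()`, which is why the environment above (pandas 2.1.0 with pyspark 3.1.2) fails. A minimal sketch of the incompatibility, using hypothetical example data:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame NLU builds internally.
df = pd.DataFrame({"text": ["hello world"]})

# pandas < 2.0 exposed DataFrame.iteritems(); pandas 2.0 removed it.
# DataFrame.items() is the replacement and yields the same
# (column_name, Series) pairs, so code can branch on availability.
if hasattr(df, "iteritems"):   # pandas 1.x
    pairs = list(df.iteritems())
else:                          # pandas 2.x
    pairs = list(df.items())

columns = [name for name, _ in pairs]
print(columns)  # → ['text']
```

Older pyspark versions call `iteritems()` unconditionally, so they raise the `AttributeError` above on any pandas 2.x install.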
C-K-Loan commented 10 months ago

Hi @CallMarl, if you downgrade pandas below 2.0 you will get around that issue, e.g. `pip install pandas==1.5.3`. We are working on a fix for that.
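Besides downgrading, a common stopgap (an assumption on my part, not the official fix in this thread) is to restore the removed alias before starting the pipeline, so the old pyspark code path keeps working on pandas 2.x:

```python
import pandas as pd

# Stopgap shim (not the official NLU fix): re-expose the alias that
# pandas 2.0 removed, pointing it at the equivalent items() method,
# so libraries that still call pdf.iteritems() don't crash.
if not hasattr(pd.DataFrame, "iteritems"):
    pd.DataFrame.iteritems = pd.DataFrame.items

# The legacy call now works again on pandas 2.x:
df = pd.DataFrame({"a": [1, 2]})
cols = [name for name, _ in df.iteritems()]
print(cols)  # → ['a']
```

This only papers over the missing method; pinning pandas as suggested above, or upgrading to a fixed NLU/pyspark release, is the cleaner route.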

C-K-Loan commented 9 months ago

Fixed in NLU 5.0.2: https://github.com/JohnSnowLabs/nlu/pull/206