elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Apache License 2.0

pandas_to_eland not working: AttributeError: 'Series' object has no attribute 'iteritems' #565

Open tariksetia opened 1 year ago

tariksetia commented 1 year ago

I am using the example from the docstring:

import pandas as pd
import eland as ed

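pd_df = pd.DataFrame(
    data={
        "A": 3.141,
        "B": 1,
        "C": "foo",
        "D": pd.Timestamp("20190102"),
        "E": [1.0, 2.0, 3.0],
        "F": False,
        "G": [1, 2, 3],
        "H": "Long text - to be indexed as es type text",
    },
    index=["0", "1", "2"],
)

# Connection and index arguments follow the eland docstring example;
# adjust them for your own cluster.
ed_df = ed.pandas_to_eland(
    pd_df=pd_df,
    es_client="http://localhost:9200",
    es_dest_index="pandas_to_eland",
    es_if_exists="append",
    es_refresh=True,
    es_type_overrides={"H": "text"},
)  # index field 'H' as text not keyword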

The script fails with:

Traceback (most recent call last):
  File "/Users/tarik.setia/Desktop/es-similarity-search/test.py", line 19, in <module>
    ed_df = ed.pandas_to_eland(
  File "/Users/tarik.setia/Desktop/es-similarity-search/.venv/lib/python3.11/site-packages/eland/etl.py", line 162, in pandas_to_eland
    mapping = FieldMappings._generate_es_mappings(pd_df, es_type_overrides)
  File "/Users/tarik.setia/Desktop/es-similarity-search/.venv/lib/python3.11/site-packages/eland/field_mappings.py", line 549, in _generate_es_mappings
    for column, dtype in dataframe.dtypes.iteritems():
  File "/Users/tarik.setia/Desktop/es-similarity-search/.venv/lib/python3.11/site-packages/pandas/core/generic.py", line 5989, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'iteritems'

The offending call is here: https://github.com/elastic/eland/blob/main/eland/field_mappings.py#L551C47-L551C56

bartbroere commented 1 year ago

@tariksetia This is most likely caused by having pandas 2.0.0 or later installed. pandas 2.0 removed Series.iteritems() (deprecated since 1.5.0) in favor of Series.items(), which is why the call in field_mappings.py fails. If you uninstall that version of pandas and install version 1.5.0 instead, you should be good.
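
To illustrate the cause, here is a minimal sketch (not eland's actual code) of iterating over a DataFrame's dtypes with Series.items(), which exists in both pandas 1.x and 2.x, whereas Series.iteritems() is gone in 2.0 and later. The helper name column_dtypes and the sample data are hypothetical, chosen only for the demonstration.

```python
import pandas as pd


def column_dtypes(dataframe: pd.DataFrame) -> dict:
    """Map each column name to its dtype name using Series.items().

    Hypothetical helper: it mirrors the shape of the failing loop
    (for column, dtype in dataframe.dtypes.iteritems()) but uses the
    method that is available on both pandas 1.x and 2.x.
    """
    return {column: str(dtype) for column, dtype in dataframe.dtypes.items()}


pd_df = pd.DataFrame({"E": [1.0, 2.0, 3.0], "G": [1, 2, 3]})

print("pandas version:", pd.__version__)
print("has Series.iteritems:", hasattr(pd.Series, "iteritems"))
print(column_dtypes(pd_df))
```

If the second line prints False, the installed pandas no longer provides iteritems, and any library code that still calls it will raise the AttributeError shown in the traceback.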

In the meantime, I'm working on supporting newer versions of pandas, including fixing the iteritems bug, in pull request #593.
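
If pinning pandas is not an option, one possible stopgap is to restore the removed alias before the failing call runs. This is only a sketch under the assumption that iteritems is the sole incompatibility being hit; other pandas 2.x differences may still break things, so pinning remains the safer route. It patches pandas globally and is not something eland itself provides.

```python
import pandas as pd

# Stopgap only: re-create the alias that pandas 2.0 removed, so code that
# still calls .iteritems() keeps working. On pandas 1.x the attribute
# already exists and nothing is changed.
if not hasattr(pd.Series, "iteritems"):
    pd.Series.iteritems = pd.Series.items
if not hasattr(pd.DataFrame, "iteritems"):
    pd.DataFrame.iteritems = pd.DataFrame.items

# Quick check that the old spelling works again on a dtypes Series.
dtypes = pd.DataFrame({"E": [1.0, 2.0], "G": [1, 2]}).dtypes
pairs = [(column, str(dtype)) for column, dtype in dtypes.iteritems()]
print(pairs)
```

The patch has to run before the code that calls iteritems, for example at the top of the script, and should be removed once a release with proper pandas 2.x support is installed.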