aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
https://aws-sdk-pandas.readthedocs.io
Apache License 2.0

wr.neptune.flatten_nested_df recursion fails after 1 round #3019

Open oniht opened 1 week ago

oniht commented 1 week ago

Describe the bug

This function fails when the recursion goes more than one level deep. The cause is the call df = df.reset_index() on line 629 of _neptune.py.

Each call to flatten_nested_df inserts the current index as a new column: the first call adds a column named index and the second adds level_0. On the third call, reset_index tries to add level_0 again and raises ValueError: cannot insert level_0, already exists.
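The naming collision can be reproduced with pandas alone, without awswrangler. The following is a minimal sketch of that behavior; the frame and its column name a are made up for illustration, and the three reset_index() calls stand in for three rounds of recursion.

```python
import pandas as pd

# Illustrative frame; the column name "a" is arbitrary.
df = pd.DataFrame({"a": [1, 2]})

df = df.reset_index()  # first call: inserts a column named "index"
df = df.reset_index()  # second call: "index" is taken, so pandas uses "level_0"

try:
    df.reset_index()   # third call: "level_0" is taken as well
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(list(df.columns))
print(error_message)

# With drop=True the old index is discarded, so repeated calls never collide.
safe = pd.DataFrame({"a": [1, 2]})
for _ in range(5):
    safe = safe.reset_index(drop=True)
print(list(safe.columns))
```

Because drop=True never adds a column, the number of recursion rounds no longer matters.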

How to Reproduce

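import awswrangler as wr
import pandas as pd

data = {'0': [[{'id': 'AA1-2024-11-21-FOO-BAR', 'label': 'flight', 'properties': ['ACREG', 'ACTYPE']}]]}
df = pd.DataFrame.from_dict(data)
wr.neptune.flatten_nested_df(df)  # raises ValueError: cannot insert level_0, already exists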

Expected behavior

Use df = df.reset_index(drop=True) so that the index is discarded instead of being inserted into the DataFrame as a column. Expected result:

                     0_id 0_label 0_properties
0  AA1-2024-11-21-FOO-BAR  flight        ACREG
0  AA1-2024-11-21-FOO-BAR  flight       ACTYPE
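To illustrate why reset_index(drop=True) is enough, here is a simplified stand-alone flattener. It is only a sketch and not the code in _neptune.py: flatten_sketch and _columns_of_type are hypothetical names, and the real function has more options (such as include_prefix and recursive) that are left out here.

```python
import pandas as pd


def _columns_of_type(df, kind):
    """Return names of columns whose values are all instances of `kind`."""
    if df.empty:
        return []
    return [c for c in df.columns if df[c].map(lambda v: isinstance(v, kind)).all()]


def flatten_sketch(df, separator="_"):
    """Hypothetical simplified flattener, not the awswrangler implementation."""
    # drop=True discards the old index instead of inserting it as a column,
    # so this line is safe no matter how deep the recursion goes.
    df = df.reset_index(drop=True)

    dict_cols = _columns_of_type(df, dict)
    list_cols = _columns_of_type(df, list)
    if not dict_cols and not list_cols:
        return df

    for col in dict_cols:
        # Expand dictionaries horizontally, prefixing keys with the column name.
        expanded = pd.json_normalize(df[col].tolist(), sep=separator)
        expanded = expanded.add_prefix(f"{col}{separator}")
        expanded.index = df.index
        df = pd.concat([df.drop(columns=[col]), expanded], axis=1)

    for col in list_cols:
        # Expand lists vertically, one row per element.
        df = df.explode(col)

    # Newly created columns may themselves be nested, so recurse.
    return flatten_sketch(df, separator)


data = {'0': [[{'id': 'AA1-2024-11-21-FOO-BAR', 'label': 'flight', 'properties': ['ACREG', 'ACTYPE']}]]}
result = flatten_sketch(pd.DataFrame.from_dict(data))
print(result)
```

The reproduction data needs four passes through this sketch (list, dict, list, then a final pass that finds nothing left to flatten) and produces the three columns 0_id, 0_label and 0_properties with one row per property. Unlike the expected output above, the sketch renumbers the index on its final pass.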

Your project

No response

Screenshots

No response

OS

Mac

Python version

3.11

AWS SDK for pandas version

3.5.2

Additional context

No response