delta-io / delta-sharing

An open protocol for secure data sharing
https://delta.io/sharing
Apache License 2.0

Critical error in python API, killed when subtables are empty #452

Open lz100 opened 5 months ago

lz100 commented 5 months ago

When I use the Python API to download some tables, this happened:

xxx/.venv/lib/python3.10/site-packages/delta_sharing/reader.py:123: FutureWarning: 
The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. 
In a future version, this will no longer exclude empty or all-NA columns when determining 
the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
Killed

It happens on this line of code in your API:

        merged = pd.concat(
            pdfs,
            axis=0,
            ignore_index=True,
            copy=False,
        )

It seems some of the pdfs are empty. Other tables worked, but this one failed with this error, and the same thing happened each of the few times I tried. This table is managed by an outside company, so I have no idea what's in it. The process was killed before I got a chance to see the content.
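For illustration, here is a minimal sketch of the kind of filtering the warning message suggests: dropping empty frames before the concat. The helper name `concat_non_empty` and the sample frames are hypothetical and not part of delta_sharing. Note that the FutureWarning is only a warning; a bare "Killed" usually means the process was terminated from outside (for example, by the operating system when memory runs out), so this sketch may address the warning without addressing the kill.

```python
import pandas as pd


def concat_non_empty(pdfs):
    """Concatenate a list of DataFrames, skipping empty ones.

    Hypothetical helper for illustration; not part of delta_sharing.
    """
    non_empty = [pdf for pdf in pdfs if not pdf.empty]
    if not non_empty:
        # Nothing to merge: keep the column layout of the first frame if there is one.
        return pdfs[0].iloc[0:0].copy() if pdfs else pd.DataFrame()
    return pd.concat(non_empty, axis=0, ignore_index=True)


# Made-up sample data: the second frame is empty, like the suspected sub-tables.
pdfs = [
    pd.DataFrame({"id": [1, 2], "name": ["a", "b"]}),
    pd.DataFrame({"id": [], "name": []}),
    pd.DataFrame({"id": [3], "name": ["c"]}),
]
merged = concat_non_empty(pdfs)
print(merged)
```

With the empty frame excluded, the concat only sees frames that carry rows, which is what the warning text asks for.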

I was using delta-sharing V1.03 with pandas 2.14.