dask / dask-expr

BSD 3-Clause "New" or "Revised" License

Running into issues computing indices with loc #1036

Closed ayushdg closed 2 months ago

ayushdg commented 2 months ago

Describe the issue: Computing the index of a DataFrame sliced with loc raises an error.

Minimal Complete Verifiable Example:

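import pandas as pd
import dask.dataframe as dd

pdf = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df = dd.from_pandas(pdf, npartitions=2)
df.loc[2].index.compute()  # raises the AttributeError below

     50 if cindexer is None: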
---> 51     return df.loc[iindexer]
     52 else:
     53     return df.loc[iindexer, cindexer]

AttributeError: 'tuple' object has no attribute 'loc'

Anything else we need to know?: Other loc calls that touch the first partition seem to work without issues, but an open-ended slice starting at 2 also fails:

df.loc[1].index.compute() # works
df.loc[1:3].index.compute() # works
df.loc[2:].index.compute() # fails
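
The pattern above can be illustrated with a small pure-Python sketch of which partitions each loc call would touch. It assumes the 3-row frame ends up with divisions (0, 2, 2), which is not stated in the report (the real values can be checked with df.divisions), and the helpers partition_of and partitions_touched are hypothetical names, not part of dask. It only shows that the failing calls are the ones that never reach the first partition; it does not claim to identify the root cause.

```python
from bisect import bisect_right

# Assumed divisions for the 3-row frame split into 2 partitions.
# This value is an assumption, not taken from the report.
DIVISIONS = (0, 2, 2)


def partition_of(label, divisions=DIVISIONS):
    """Return the index of the partition that would hold `label`.

    Labels outside the divisions are clamped to the nearest end, and the
    final division is treated as an inclusive upper bound.
    """
    last_partition = len(divisions) - 2
    clamped = min(max(label, divisions[0]), divisions[-1])
    return min(bisect_right(divisions, clamped) - 1, last_partition)


def partitions_touched(start, stop, divisions=DIVISIONS):
    """Return the partition indices covered by a label slice start:stop.

    `None` for either bound means the slice is open on that side.
    """
    first = 0 if start is None else partition_of(start, divisions)
    last = len(divisions) - 2 if stop is None else partition_of(stop, divisions)
    return list(range(first, last + 1))


if __name__ == "__main__":
    cases = {
        "loc[1]": (1, 1),
        "loc[1:3]": (1, 3),
        "loc[2]": (2, 2),
        "loc[2:]": (2, None),
    }
    for name, (start, stop) in cases.items():
        print(name, "touches partitions", partitions_touched(start, stop))
```

Under that assumption, loc[1] and loc[1:3] both include partition 0, while loc[2] and loc[2:] touch only the last partition, which matches the working and failing cases listed.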

Environment:

phofl commented 2 months ago

Thanks for the report. I put up a PR to fix this.

phofl commented 2 months ago

I pushed out a release with a fix. It's already on PyPI and will be on conda-forge soon.