pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

BUG: Chained loc indexing reversal on dataframe with datetime index has different results than on dataframes with other index types #51156

Open ChrisDarley opened 1 year ago

ChrisDarley commented 1 year ago

Reproducible Example

import numpy as np
import pandas as pd

# Two integer columns on a daily DatetimeIndex
a = pd.DataFrame(
    data=np.array([[1, 2, 3, 4], [10, 20, 30, 40]]).transpose(),
    index=pd.date_range(start="2020-01-01", periods=4, freq="D"),
)
print(a.loc[::-1].loc[::-1])

Issue Description

A common way to reverse a dataframe is df.loc[::-1]. If the dataframe has a datetime index and you create a reversed view with df.loc[::-1], then reversing that view again (i.e. df.loc[::-1].loc[::-1]) returns only the last row of the original dataframe. This does not happen with df.iloc[::-1].iloc[::-1], nor with basic indexing such as df[::-1][::-1]. I think the issue is specific to datetime indexes, because df.loc[::-1].loc[::-1] behaved as expected when I used a string index. I'm not sure that df.loc[::-1] is the preferred way to reverse a dataframe, but if double reversal works on a dataframe with a string index, it should in principle also work on a dataframe with a datetime index.
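For comparison, here is a minimal sketch of the two position-based reversals mentioned above, run on the same frame as the reproducible example. It only covers the variants reported as working; the names roundtrip_iloc and roundtrip_getitem are illustrative.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(
    data=np.array([[1, 2, 3, 4], [10, 20, 30, 40]]).transpose(),
    index=pd.date_range(start="2020-01-01", periods=4, freq="D"),
)

# Position-based reversal does not look up index labels,
# so reversing twice should restore the original row order.
roundtrip_iloc = a.iloc[::-1].iloc[::-1]
roundtrip_getitem = a[::-1][::-1]

print(roundtrip_iloc.equals(a))
print(roundtrip_getitem.equals(a))
```

Until the label-based case is resolved, iloc looks like the safer way to reverse a frame whose index order is not guaranteed.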

Expected Behavior

The double reversal should return the original dataframe, as it does on a dataframe with a string index:

import numpy as np
import pandas as pd

a = pd.DataFrame(
    data=np.array([[1, 2, 3, 4], [10, 20, 30, 40]]).transpose(),
    index=["a", "b", "c", "d"],
)
print(a.loc[::-1].loc[::-1])

Installed Versions

INSTALLED VERSIONS
------------------
commit : 2e218d10984e9919f0296931d92ea851c6a6faf5
python : 3.10.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-53-generic
Version : #59-Ubuntu SMP Mon Oct 17 18:53:30 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.3
numpy : 1.24.1
pytz : 2022.7
dateutil : 2.8.2
setuptools : 59.6.0
pip : 22.0.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.2
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.7.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : 1.4.45
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None

phofl commented 1 year ago

Hi, thanks for your report. I think this is expected: slicing on a DatetimeIndex is only supposed to work on monotonic indexes. cc @jbrockmendel

jbrockmendel commented 1 year ago

Do they need to be monotonic increasing, or is any monotonic order enough? It seems like ::-1 is pretty unambiguous as to what we'd expect.

phofl commented 1 year ago

Sorry, I should have been clearer: monotonic increasing. We could certainly make this work, but the whole conversion into an indexer assumes a monotonic increasing index.
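To make the distinction concrete, the sketch below checks both monotonicity properties on the index from the report and on its reversal. The variable names are illustrative; is_monotonic_increasing and is_monotonic_decreasing are existing Index properties.

```python
import pandas as pd

idx = pd.date_range(start="2020-01-01", periods=4, freq="D")
reversed_idx = idx[::-1]  # same dates, descending order

# The original index is monotonic increasing; the reversed one is
# still monotonic, but only in the decreasing direction.
print(idx.is_monotonic_increasing, idx.is_monotonic_decreasing)
print(reversed_idx.is_monotonic_increasing, reversed_idx.is_monotonic_decreasing)
```

So after the first reversal the frame no longer meets the monotonic increasing assumption, which is the state in which the second .loc[::-1] is evaluated.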

topper-123 commented 1 year ago

If it doesn't/shouldn't work, then the operation should raise, e.g. a KeyError?
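As a rough illustration of that suggestion, the sketch below wraps .loc in a user-level guard that raises KeyError when the DatetimeIndex is not monotonic increasing. The helper checked_label_slice is hypothetical and not part of pandas; it only shows what a raising behavior could look like from the caller's side, not how pandas would implement it.

```python
import pandas as pd


def checked_label_slice(df, key):
    """Hypothetical helper, not a pandas API: refuse label-based slicing
    on a DatetimeIndex that is not monotonic increasing."""
    if isinstance(df.index, pd.DatetimeIndex) and not df.index.is_monotonic_increasing:
        raise KeyError(
            "label-based slicing requires a monotonic increasing DatetimeIndex"
        )
    return df.loc[key]


frame = pd.DataFrame(
    {"x": [1, 2, 3, 4]},
    index=pd.date_range(start="2020-01-01", periods=4, freq="D"),
)

# First reversal: the index is increasing, so the guard lets it through.
reversed_frame = checked_label_slice(frame, slice(None, None, -1))

# Second reversal: the index is now decreasing, so the guard raises
# instead of returning a partial result.
try:
    checked_label_slice(reversed_frame, slice(None, None, -1))
except KeyError as exc:
    print("raised:", exc)
```

Whether KeyError is the right exception type, and whether a full-range slice like ::-1 should be exempt, is the open question in this thread.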