Open phofl opened 1 week ago
cc @jbrockmendel thoughts here?
I don't have an Ubuntu env ATM, so speculating: sort_values goes through nargsort, which I suspect may have undefined behavior in non-unique cases.
BTW the example in the OP is not reproducible, needs an index passed to DataFrame.
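To make the non-unique case concrete, here is a minimal NumPy-only sketch (an illustration, not pandas' actual code path; the array `values` is made up for the example). With every element tied, a stable sort must return the identity permutation, while the default sort kind only promises some valid permutation, so the order among ties could plausibly differ between platforms or builds.

```python
import numpy as np

# All values tied, as in a constant column (illustrative data only).
values = np.ones(100, dtype=np.int64)

# A stable sort keeps tied elements in their original order,
# so the indexer is the identity permutation 0..n-1.
stable = np.argsort(values, kind="stable")
print((stable == np.arange(len(values))).all())

# The default kind ("quicksort") makes no stability promise: the result is
# a valid permutation, but the order among ties is unspecified.
default = np.argsort(values)
print((default == np.arange(len(values))).all())
```

If the second check differs between machines, that would be consistent with the speculation above, though it would not prove that this is what happens inside sort_values.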
@phofl pandas has historically had issues preserving the freq attribute of a DatetimeIndex after certain operations. Ideally, sort_values should not affect freq, but inconsistencies can arise from differences in underlying implementations or versions. You can try this code:
import pandas as pd
# Create a DataFrame with a single column
pdf = pd.DataFrame({"z": 1}, index=pd.date_range(start='2020-12-31', periods=100, freq='D'))
# Sort the values in the column
sorted_values = pdf.z.sort_values()
# Restore the frequency attribute (raises ValueError if the sorted index no longer conforms to it)
sorted_values.index.freq = pdf.index.freq
# Print the sorted values
print(sorted_values)
Conclusion: although pandas should ideally behave consistently across platforms, occasional discrepancies do occur. Explicitly managing the freq attribute helps your code behave as expected regardless of the operating system.
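A slightly more defensive variant of the workaround above, as a sketch under assumptions: it requests a stable sort with `kind="stable"` so that tied values keep their original order, and it re-attaches freq only if it is actually missing. The name `result` is illustrative, and whether sort_values keeps freq on a given platform is exactly the open question in this issue, so the code checks rather than assumes.

```python
import pandas as pd

idx = pd.date_range(start="2020-12-31", periods=100, freq="D")
pdf = pd.DataFrame({"z": 1}, index=idx)

# kind="stable" requests a stable sort, so ties keep their original order.
result = pdf["z"].sort_values(kind="stable")

# With every value tied, a stable sort should leave the index order unchanged.
print(result.index.equals(pdf.index))

# Check instead of assuming that freq survived the sort.
if result.index.freq is None:
    # Rebuilding the index validates the freq against the values and
    # raises ValueError if they do not conform to it.
    result.index = pd.DatetimeIndex(result.index, freq=pdf.index.freq)

print(result.index.freq)
```

Rebuilding the index avoids mutating an index object that may be shared with the original frame. This only works when the sorted index is still regularly spaced; for a real reordering, freq cannot be restored.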
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
This preserves the freq on Ubuntu but loses it on macOS.
Expected Behavior
don't know, consistency above everything else
I can't test the nightlies on Ubuntu right now, which is why this was only checked on 2.2.2.
Installed Versions