Open ivan-marroquin opened 5 days ago
breaking change : #55948
@ivan-marroquin
It does not look like this is supported. Two things:
- id_vars is no longer checked against a flattened index (changed in https://github.com/pandas-dev/pandas/pull/55948). For a MultiIndex, it must be a list of tuples.
- var_name must be a scalar, as mentioned in the docs (enforced in https://github.com/pandas-dev/pandas/pull/55948). It looks like for a MultiIndex the variable names are taken from the names of the column levels and cannot be specified.
Therefore, the following might be the correct way to go about this:
data.columns.names = ['Attribute', 'Ticker']
data.melt(id_vars=[('Date', '')]).rename(columns={('Date', ''): 'Date'})
Output -
Date Attribute Ticker value
0 2023-10-12 00:00:00+00:00 Adj Close HDFCBANK.NS 1.528971e+03
1 2023-10-13 00:00:00+00:00 Adj Close HDFCBANK.NS 1.515061e+03
2 2023-10-16 00:00:00+00:00 Adj Close HDFCBANK.NS 1.508994e+03
3 2023-10-17 00:00:00+00:00 Adj Close HDFCBANK.NS 1.520438e+03
4 2023-10-18 00:00:00+00:00 Adj Close HDFCBANK.NS 1.499277e+03
... ... ... ... ...
5851 2024-10-04 00:00:00+00:00 Volume TCS.NS 2.965463e+06
5852 2024-10-07 00:00:00+00:00 Volume TCS.NS 1.472619e+06
5853 2024-10-08 00:00:00+00:00 Volume TCS.NS 1.541867e+06
5854 2024-10-09 00:00:00+00:00 Volume TCS.NS 1.082504e+06
5855 2024-10-10 00:00:00+00:00 Volume TCS.NS 2.378875e+06
[5856 rows x 4 columns]
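For anyone who wants to try the suggestion without the original price data, here is a minimal sketch on a small synthetic frame. The tickers "AAA" and "BBB", the attribute names, and the values are made up for illustration; only the melt/rename call follows the comment above. It assumes the Date index was moved into the columns with reset_index(), which on two-level columns stores it under the label ('Date', '').

```python
import pandas as pd

# Synthetic stand-in for the downloaded data (hypothetical tickers and values).
dates = pd.date_range("2023-10-12", periods=3, freq="D", name="Date")
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAA", "BBB"]], names=["Attribute", "Ticker"]
)
values = [
    [10.0, 20.0, 100.0, 200.0],
    [11.0, 21.0, 110.0, 210.0],
    [12.0, 22.0, 120.0, 220.0],
]
data = pd.DataFrame(values, index=dates, columns=columns)

# reset_index() on two-level columns places the index under ('Date', '').
data = data.reset_index()

# id_vars is given as a list of tuples; the variable columns take their
# names from the column level names ('Attribute', 'Ticker').
melted = data.melt(id_vars=[("Date", "")]).rename(columns={("Date", ""): "Date"})

print(melted.head())
```

Each (Attribute, Ticker) pair contributes one row per date, so the three dates and four value columns here give twelve rows.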
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
KeyError                                  Traceback (most recent call last)
Cell In [19], line 15
     11 data= data.reset_index()
     13 # melt the DataFrame to make it long format where each row is a
     14 # unique combination of Date, Ticker, and attributes
---> 15 data_melted= data.melt(id_vars= ['Date'], var_name= ['Attribute', 'Ticker'])
     17 # pivot the melted DataFrame to have the attributes (Open, High, Low, etc.) as columns
     18 data_pivoted= data_melted.pivot_table(index= ['Date', 'Ticker'],
     19                                       columns= 'Attribute', values= 'value',
     20                                       aggfunc= 'first')

File ~/python_3.9.0/lib/python3.9/site-packages/pandas/core/frame.py:9942, in DataFrame.melt(self, id_vars, value_vars, var_name, value_name, col_level, ignore_index)
   9932 @Appender(_shared_docs["melt"] % {"caller": "df.melt(", "other": "melt"})
   9933 def melt(
   9934     self,
   (...)
   9940     ignore_index: bool = True,
   9941 ) -> DataFrame:
-> 9942     return melt(
   9943         self,
   9944         id_vars=id_vars,
   9945         value_vars=value_vars,
   9946         var_name=var_name,
   9947         value_name=value_name,
...
     77     )
     78     if value_vars_was_not_none:
     79         frame = frame.iloc[:, algos.unique(idx)]
KeyError: "The following id_vars or value_vars are not present in the DataFrame: ['Date']"
Expected Behavior
The melt function should produce a long-format DataFrame in which each row is a unique combination of Date, Ticker, and attribute.
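As a rough illustration of that expected result, the sketch below melts a small synthetic frame using tuple id_vars and then applies the same pivot_table call that appears in the traceback. The tickers and values are invented for the example, and this is only one way the intended pipeline might look, not a confirmed fix.

```python
import pandas as pd

# Synthetic two-level columns: (Attribute, Ticker); tickers are hypothetical.
dates = pd.date_range("2023-10-12", periods=2, freq="D", name="Date")
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["AAA", "BBB"]], names=["Attribute", "Ticker"]
)
data = pd.DataFrame(
    [[10.0, 20.0, 100.0, 200.0], [11.0, 21.0, 110.0, 210.0]],
    index=dates,
    columns=columns,
).reset_index()

# Long format: one row per (Date, Attribute, Ticker).
long_df = data.melt(id_vars=[("Date", "")]).rename(columns={("Date", ""): "Date"})

# Pivot so that each row is a (Date, Ticker) pair and each attribute is a column.
wide_df = long_df.pivot_table(
    index=["Date", "Ticker"], columns="Attribute", values="value", aggfunc="first"
)

print(wide_df)
```

The result is indexed by (Date, Ticker), with one column per attribute.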
Installed Versions