Open markoshorro opened 3 years ago
Thanks for the report @markoshorro. What's happening is somewhat strange, but I believe it is intended (though it could definitely be better documented). Column labels are compared as strings, so uppercase letters sort before lowercase ones. For the first case, CZ < Caladin < Corrin < Ix, so with forward filling specified, CZ becomes a NaN column since there is no earlier column to forward fill from. For the second case, Caladin < Corrin < DZ < Ix, so the values from Corrin are forward filled into DZ. For the final case, Caladin < Corrin < Ix < TZ, so the values from Ix are forward filled into TZ.
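The ordering argument above can be checked with a small sketch. The frame contents are made up for illustration, since the original code sample is not reproduced in this thread; only the column labels Caladin, Corrin, Ix and the new labels CZ, DZ, TZ come from the discussion. The helper `with_new_column` is a hypothetical name, not part of pandas.

```python
import pandas as pd

# Hypothetical data; only the column labels come from the discussion above.
df = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    columns=["Caladin", "Corrin", "Ix"],
)


def with_new_column(frame, name):
    """Reindex the columns with forward filling, appending one new label."""
    return frame.reindex(columns=[*frame.columns, name], method="ffill")


cz = with_new_column(df, "CZ")  # "CZ" sorts before "Caladin": nothing to fill from
dz = with_new_column(df, "DZ")  # "DZ" sorts right after "Corrin": filled from Corrin
tz = with_new_column(df, "TZ")  # "TZ" sorts after "Ix": filled from Ix

print(cz, dz, tz, sep="\n\n")
```

If the explanation holds, the CZ column should be all NaN, DZ should repeat the Corrin values, and TZ should repeat the Ix values.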
Thanks, @mzeitlin11, for your prompt and correct response. (Just for clarity, I believe you have a small typo in the "(...) second case, Caladin < Corrin < DZ < Ix, (...)" instead of "CZ"; just a minor issue :-)).
Now I do understand. My first thought was that `method` in `reindex` should affect gaps in new rows but only over existing columns. The documentation describes the `method` parameter as:
Method to use for filling holes in reindexed DataFrame. Please note: this is only applicable to DataFrames/Series with a monotonically increasing/decreasing index.
So my first thought was that gaps in new columns should not be affected. It could be just my misunderstanding, but I believe some further clarification could be helpful for users. Will prepare a small pull request addressing this issue.
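For readers who want the behavior expected here (fill gaps in new rows, leave new columns empty), one possible workaround is to reindex the two axes in separate calls, so that `method` only applies to the rows. This is a sketch with made-up data and hypothetical variable names, not a fix proposed in this thread.

```python
import pandas as pd

# Hypothetical frame with a gap in the row index (label 1 is missing).
df = pd.DataFrame(
    {"Caladin": [1.0, 2.0], "Corrin": [3.0, 4.0], "Ix": [5.0, 6.0]},
    index=[0, 2],
)
new_index = [0, 1, 2]
new_columns = ["Caladin", "Corrin", "Ix", "DZ"]

# One call: forward filling is applied along both axes,
# so the new column "DZ" is filled from "Corrin".
both = df.reindex(index=new_index, columns=new_columns, method="ffill")

# Two calls: forward fill the rows only, then add columns without a method,
# so the new column "DZ" stays NaN.
rows_only = df.reindex(index=new_index, method="ffill").reindex(columns=new_columns)

print(both)
print(rows_only)
```

Splitting the call keeps the row filling while making the result independent of how the new column name sorts against the existing ones.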
> Thanks, @mzeitlin11, for your prompt and right response. (Just for clarity, I believe you have a small typo in the "(...) second case, Caladin < Corrin < DZ < Ix, (...)" instead of "CZ"; just a minor issue :-)).
Will edit, thanks!
Yep, a pull request to improve the documentation or add an example would be very welcome (this is definitely confusing, and I'm not even 100% sure this behavior is what it should be).
I will dig a bit before writing anything, following your examples, but I will definitely improve the current documentation to clarify these cases.
[x] I have checked that this issue has not already been reported.
[x] I have confirmed this bug exists on the latest version of pandas.
[ ] (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Problem description
When `DataFrame.reindex` changes the column labels and a new column name is unknown, it should not fill that column, yet it does. Moreover, the behavior differs depending on the new column's name, which is even stranger.
Expected Output
New column not filled; same behavior regardless of the name of the new column(s).
Output of `pd.show_versions()`