You see the columns are ["x", "y", "x", "y"].
The issue is that whichever x and y came second will be the ones used. So when we rename data_x and data_y, if they were "first" in the dataframe, the rename won't work as expected.
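A quick way to spot this condition is to look for repeated names in the state. The sketch below is only an illustration: it assumes the state is a plain dict and that the ordered column list sits under a "column_names" key (an assumption on my part, not something shown above), and `duplicated_columns` is a hypothetical helper, not part of vaex.

```python
from collections import Counter


def duplicated_columns(state, key="column_names"):
    """Return the names that appear more than once, in first-seen order.

    `key` is an assumed name for the entry holding the ordered column list.
    """
    counts = Counter(state.get(key, []))
    return [name for name, n in counts.items() if n > 1]


# Hand-written state fragment mirroring the column list quoted above.
state = {"column_names": ["x", "y", "x", "y"]}
print(duplicated_columns(state))  # ['x', 'y']
```

With a real dataframe you would pass in the result of df.state_get() instead of the hand-written dict.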
What should happen?
Ideally, if the column already exists, it should be renamed to a hidden _column_ and the new one should take over.
At a minimum, though, vaex should throw an error saying that you cannot rename to a column that already exists. Either would do, but ideally the first.
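To make the two options concrete, here is a toy model of the behaviour being asked for. It is not vaex code: `ToyFrame`, the `on_conflict` parameter and the "__" prefix used for hidden names are all made up for illustration.

```python
class ToyFrame:
    """Toy stand-in for a dataframe: an ordered mapping of name -> values."""

    def __init__(self, **columns):
        self.columns = dict(columns)

    def rename(self, old, new, on_conflict="hide"):
        if old not in self.columns:
            raise KeyError(f"no such column: {old!r}")
        if new in self.columns and new != old:
            if on_conflict == "error":
                # Minimum behaviour: refuse to rename onto an existing column.
                raise ValueError(
                    f"cannot rename {old!r} to {new!r}: column already exists"
                )
            # Preferred behaviour: move the existing column to a hidden name.
            hidden = "__" + new
            while hidden in self.columns:
                hidden = "_" + hidden
            self.columns[hidden] = self.columns.pop(new)
        self.columns[new] = self.columns.pop(old)

    def visible(self):
        # Names starting with "__" are treated as hidden in this toy model.
        return [name for name in self.columns if not name.startswith("__")]


df = ToyFrame(x=[1, 2], y=[3, 4], data_x=[10, 20], data_y=[30, 40])
df.rename("data_x", "x")
df.rename("data_y", "y")
print(df.visible())     # ['x', 'y']
print(df.columns["x"])  # [10, 20]
```

With on_conflict="error" the same rename raises a ValueError instead, which is the fallback behaviour described above.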
Software information
Vaex version (import vaex; vaex.__version__): 4.16.0
Vaex was installed via: pip / conda-forge / from source
So this is a tricky little bug.
Because we were renaming but not dropping the original columns, sometimes vaex wouldn't overwrite correctly (I'll make an issue in the vaex GitHub).
You can run these to understand the issue fully
This will work as expected. The dataframe will show 2 columns, x and y, and the values will match those of data_x and data_y.
This will fail.
The reason has to do with the state. If you look at the state_get() of either dataframe:
df.state_get()
You'll see something like this
You see the columns are ["x", "y", "x", "y"].
The issue is that whichever x and y came second will be the ones used. So when we rename data_x and data_y, if they were "first" in the dataframe, the rename won't work as expected.
What should happen?
Ideally, if the column already exists, it should be renamed to a hidden _column_ and the new one should take over.
At a minimum, though, vaex should throw an error saying that you cannot rename to a column that already exists. Either would do, but ideally the first.
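Until vaex handles this itself, the workaround hinted at earlier (drop the original column before renaming onto its name) could look roughly like this. `replace_column` is a hypothetical helper. It assumes the dataframe exposes get_column_names(), drop(columns, inplace=True) and rename(old, new) the way vaex's DataFrame API documents them; I have not verified it against 4.16.0.

```python
def replace_column(df, source, target):
    """Rename `source` to `target`, dropping any existing `target` first.

    `df` is expected to be a vaex-like dataframe (assumed method names:
    get_column_names, drop, rename). The dataframe is modified in place.
    """
    names = df.get_column_names()
    if source not in names:
        raise KeyError(f"no such column: {source!r}")
    if target in names and target != source:
        # Remove the old column so only one `target` is left after the rename.
        df.drop([target], inplace=True)
    df.rename(source, target)
    return df


# Intended usage with a real dataframe (not run here):
# replace_column(df, "data_x", "x")
# replace_column(df, "data_y", "y")
```

Checking for the source column up front means a typo in the name fails before anything has been dropped.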