Closed Ben-Epstein closed 3 years ago
Hi,
To combine data from two vaex DataFrames, you first have to join them. In my opinion, the name "DataFrame" is somewhat misleading in the vaex world. You should see a vaex "DataFrame" as a list of pending commands to be run 'when the time comes'. These commands are "Expressions". An "Expression" belongs to exactly one "DataFrame"; it has no meaning in another "DataFrame".
And as stated in add_virtual_column(), the 2nd expected parameter is an "Expression".
So before anything else:
df2 = df2.join(df1)
Bests,
In Vaex, DataFrames are sort of "islands", they do not really interact with each other. Kind of how tables in an SQL database do not interact with each other.
So indeed, as @yohplala said, join is your best bet. You can also join without providing any key, and in that case 2 dataframes will just be put next to each other. That is convenient if you know for example that the ordering is the same, since it is super fast (basically as fast as if you had one big dataframe).
You can also add individual columns to a dataframe, but they need to be in-memory structures (for example NumPy arrays). So you need to do something like external_column = df1['col1'].values and then df2['external_column'] = external_column. Keep in mind that if you go for this approach, the length of the external column must be the same as the unfiltered length of the target dataframe. You can also see this.
The approach in the OP will not work for the reasons @yohplala already stated. You can think of an expression in vaex as a mathematical expression such as a+b. It is stored as such (as a formula or a command) until it needs to be executed. Sometimes we call "columns" the data that already exists on disk (or in memory), ready to use. But an expression is more general: it can simply point to data that is in memory or on disk, or it can be a mathematical expression ready to be executed to get the results.
Also, for small(?) questions, especially usage-based ones, maybe you can join the Slack?
@JovanVeljanoski I'd love to join the slack. I didn't know there was one, where is it mentioned?
Thank you for reaching out and helping us improve Vaex!
Description
If I have 2 dataframes, df1 and df2, and I want the output of an expression of df1 to be applied to df2, I currently cannot.
If I run the same thing but change that second to last line to
this works but not as expected. Upon an export, the resulting value is gone, and a print of df2.virtual_columns is empty. I think it's attaching it as a simple python attribute, not an actual vaex column.
I also tried something like
but that gave me a massive stack trace ending in
same with
The only solution I can think to do for now is something like
which is of course not ideal.
Do you have any suggestions/workarounds?
Thanks!
Software information
Vaex version (import vaex; vaex.__version__):
Vaex was installed via: pip