vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[FEATURE-REQUEST] Vaex All Columns Dynamic Access Support #2109

Closed khanfarhan10 closed 2 years ago

khanfarhan10 commented 2 years ago

Description: I want to combine all columns into a single column in vaex.

Something like :

df["combined"] = ",".join(df[reduced_cols])

Is your feature request related to a problem? Please describe. There is no simple way to do this in vaex.

Additional context: This can be done in pandas using apply with axis=1, something like:

df["combined"] = df[reduced_cols].apply(
    lambda row: ",".join(row.values.astype(str)), axis=1
)
khanfarhan10 commented 2 years ago

@JovanVeljanoski any idea on this?

JovanVeljanoski commented 2 years ago

Hey,

I believe Vaex in general does not support the axis argument, so most if not all operations are column-oriented (with the exception of joins, of course).

But there are relatively easy ways to accomplish what you are after. For example, the first thing that comes to mind is:

import vaex

df = vaex.example()

# Get all the columns
columns = df.get_column_names()

# Build an expression in a loop
expr = df[columns[0]].astype('string')
for col in columns[1:]:
    expr += df[col].astype('string')

# Assign the expression to the dataframe
df['everything'] = expr

print(df)
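
Note that the loop above concatenates the values without a separator, while the original request joined them with commas. As a minimal sketch of the same column-by-column idea with a separator, here it is in pandas, using hypothetical toy data and placeholder column names. Whether a vaex string expression accepts a literal "," in the same way is an assumption that should be checked against the vaex documentation before relying on it.

```python
import pandas as pd

# Hypothetical toy data; the column names are placeholders, not from the thread.
df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5], "c": ["x", "y"]})
reduced_cols = ["a", "b", "c"]

# Column-wise: start with the first column as strings, then append
# ",<next column>" one column at a time, mirroring the loop above.
combined = df[reduced_cols[0]].astype(str)
for col in reduced_cols[1:]:
    combined = combined + "," + df[col].astype(str)
df["combined"] = combined

# Row-wise version from the original request, kept for comparison.
row_wise = df[reduced_cols].apply(
    lambda row: ",".join(row.values.astype(str)), axis=1
)

print(df["combined"].tolist())
print(row_wise.tolist())
```

The column-wise form never iterates over rows in Python, which is why it maps naturally onto a column-oriented library.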
JovanVeljanoski commented 2 years ago

Closing due to inactivity. Please reopen if needed.

khanfarhan10 commented 2 years ago

Apologies for not replying (must have missed this!!)

I believe that perfectly answers my queries!

Since Vaex will use an expression for storing stuff I believe it is still fast!

maartenbreddels commented 2 years ago

Since Vaex will use an expression for storing stuff I believe it is still fast!

It will not use much memory, but that might make it slow. Please consult https://vaex.io/docs/guides/performance.html

Happy to hear it answers your question!