-
I am slightly confused about how to use Vaex.ml with "out of memory" dataframes.
I can very nicely use Vaex to create the 100 million rows of training data and split it into test and train sets, as everything see…
-
Hi,
I've been using vaex to create an HDF5 file, open it, and then apply filters multiple times (in a loop) to extract data from the large HDF5 file (~3GB). The filtering and extraction process t…
-
Error 1: I read a .hdf5 file from S3 to a VM every day for further use cases. However, even after deleting the file from S3, I am still able to read the file with its complete data.
Does Vaex cache the dat…
-
**Description**
I am trying to save the result of `diff()` in another column of my vDataFrame*, and the behavior surprises me.
(* vDataFrame: vaex DataFrame or virtual DataFrame, as you feel it ;))…
-
I am using Vaex because my dataset is huge (45 GB). I tried to use sklearn.feature_extraction.text.TfidfVectorizer but could not apply it to a vaex dataframe.
Can anyone help me to create a TF-IDF matrix using…
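
For reference, the general idea behind building TF-IDF over data that does not fit in memory can be sketched in plain Python as two passes over chunks: one to count document frequencies, one to emit sparse rows. This is only an illustrative sketch, not vaex or scikit-learn API. The helper names (`iter_chunks`, `document_frequencies`, `tfidf_rows`), the whitespace tokenizer, and the unsmoothed `log(N / df)` weighting are all assumptions made for the example, and `docs` is assumed to be re-iterable so it can be read twice.

```python
import math
from collections import Counter


def iter_chunks(docs, chunk_size):
    """Yield successive lists of at most chunk_size documents."""
    chunk = []
    for doc in docs:
        chunk.append(doc)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def document_frequencies(docs, chunk_size=2):
    """Pass 1: count, chunk by chunk, how many documents contain each term."""
    df = Counter()
    n_docs = 0
    for chunk in iter_chunks(docs, chunk_size):
        for doc in chunk:
            # A set ensures each term is counted once per document.
            df.update(set(doc.lower().split()))
            n_docs += 1
    return df, n_docs


def tfidf_rows(docs, df, n_docs, chunk_size=2):
    """Pass 2: yield one sparse {term: weight} dict per document."""
    for chunk in iter_chunks(docs, chunk_size):
        for doc in chunk:
            counts = Counter(doc.lower().split())
            total = sum(counts.values())
            # Unsmoothed variant: tf = count / length, idf = log(N / df).
            yield {t: (c / total) * math.log(n_docs / df[t])
                   for t, c in counts.items()}


# Tiny in-memory stand-in for a large text column (hypothetical data).
docs = ["red blue red", "blue green", "red"]
df, n_docs = document_frequencies(docs)
rows = list(tfidf_rows(docs, df, n_docs))
```

Only the term-count dictionary and one chunk are held in memory at a time; the weights differ from scikit-learn's defaults, which apply smoothing and normalization.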
-
Consider the following snippet:
```
# -*- coding: utf-8 -*-
import pandas as pd
import vaex
import numpy as np
import time
dates = pd.date_range("01-01-2019", "14-04-2020", freq="50S")
num…
-
I am looking to replace values that are not in the mapper dictionary with np.nan.
Here is a reproducible example:
```python
df = vaex.from_arrays(color=['red', 'red', 'blue', 'red', 'green', 'n…
-
**Description**
vaex.concat([ori_df, add_df]) and ori_df.concat(add_df) both fail when add_df contains only one row.
If add_df contains 2 rows, there is no error.
>
```
>>>ori_df
# …
-
Not sure if this is a bug or an intentional restriction, but it's not spelled out in the documentation.
You *can* convert these to concatenated dataframes, but they break badly when you actually tr…
-
**Description**
If we try to group by one column that contains missing values, the result is erratic.
Running
```
df1 = vaex.from_pandas(pd.DataFrame([1,2,3], columns=['idx']))
df2 = vaex.from…