vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License
8.25k stars 590 forks

feature: `get_column_names` accepts a dtypes argument #2160

Closed: JovanVeljanoski closed 2 years ago

JovanVeljanoski commented 2 years ago

Now we can query column names by data type:

```python
import vaex

df = vaex.datasets.titanic()
df.get_column_names(dtypes=[float])
# returns ['age', 'fare', 'body']
```

Several attempts at this have been made in the past. This is my latest proposal, and I think it works quite well. Closes:

maartenbreddels commented 2 years ago

See the comments in #1766, mainly:

* dtypes -> dtype

* Possibly put this in the docs?

JovanVeljanoski commented 2 years ago

> See the comments in #1766, mainly
>
> * dtypes -> dtype
> * Possibly put this in the docs?

The idea behind `dtypes` is that you can pass more than one, i.e. a list. Or we can indeed change it to `dtype` and be more explicit in the docstring that you can also pass a list. Let's go for that.
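To make the "singular name, but a list is also accepted" idea concrete, here is a minimal sketch in plain Python. It is not the vaex implementation: the helper `filter_columns_by_dtype` and the `schema` dictionary are hypothetical and only illustrate how a single `dtype` argument could accept either one type or a list of types.

```python
def filter_columns_by_dtype(column_dtypes, dtype=None):
    """Return column names whose type matches `dtype`.

    Hypothetical helper for illustration only (not part of vaex).
    `column_dtypes` maps column name -> Python type.
    `dtype` may be a single type or a list/tuple/set of types.
    """
    if dtype is None:
        # No filter requested: return every column name.
        return list(column_dtypes)
    # Normalize a single type into a one-element list.
    wanted = dtype if isinstance(dtype, (list, tuple, set)) else [dtype]
    return [name for name, dt in column_dtypes.items() if dt in wanted]


# Made-up schema, loosely modeled on a few Titanic-style columns.
schema = {"pclass": int, "name": str, "age": float, "fare": float}

print(filter_columns_by_dtype(schema, dtype=float))         # ['age', 'fare']
print(filter_columns_by_dtype(schema, dtype=[float, int]))  # ['pclass', 'age', 'fare']
```

Normalizing the argument up front keeps the singular name consistent with the rest of the API while still covering the multi-type case.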

maartenbreddels commented 2 years ago

Yes, the point is that we have always used the singular instead of the plural.