pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

Feature request: Add `collect()` to `DataFrame` #19548

Open Chuck321123 opened 4 weeks ago

Chuck321123 commented 4 weeks ago

Description

So collecting a df is something you do with LazyFrames. However, I sometimes run my code in eager mode and sometimes in lazy mode, and the number of if-else and try-except blocks I have to write makes it exhausting to switch between the two. I would prefer not to get an AttributeError when calling collect() on an eager frame. I can't be the only one who wants this, I believe, or have I missed something?

Example:

import polars as pl

df = pl.DataFrame({"column1": [1, 2, 3]})

df = df.collect()  # AttributeError: 'DataFrame' object has no attribute 'collect'

MarcoGorelli commented 4 weeks ago

hey, thanks for the request

this was discussed previously and rejected, could you search the issue tracker please?

cmdlineluser commented 4 weeks ago

Chuck321123 commented 4 weeks ago

@MarcoGorelli Hmm, I see. I would still be open to it emitting a warning, or to having to pass an explicit keyword argument to make it work.

alexander-beedie commented 4 weeks ago

The extreme asymmetry in the pros and cons is what makes this undesirable (imho):