Closed by etiennebacher 2 months ago
IIUC, the object type is Python-specific, not a real Apache Arrow type (so we don't support it).
Are you looking for pl$Series(values = data.frame(a = 1))?
This is equivalent to passing a list:
> pl$Series(values = data.frame(a = 1))
polars Series: shape: (1,)
Series: '' [list[f64]]
[
[1.0]
]
> pl$Series(values = list(a = 1))
polars Series: shape: (1,)
Series: '' [list[f64]]
[
[1.0]
]
Oh, sorry. This is the one. https://github.com/pola-rs/r-polars/blob/3c0d0ec62a86da7fd66ef1afc53df590f384452f/R/as_polars.R#L367-L371
Can we close this now that #1015 has been merged? As I commented, the Object type is for storing Python objects, so I don't see the point in supporting it here. (Since R's list can contain a variety of things, we can always use the base R data.frame if we want to store something that is not supported by Apache Arrow.)
As I commented, the Object type is for storing Python objects, so I don't see the point in supporting it here.
That's something worth mentioning in the docs, I think. I'll add that in #1014 and close this issue with that PR.
Actually it's hard to construct Struct for Series:
>>> pl.Series([{"a": 1, "b": ["x", "y"]}, {"a": 2, "b": ["z"]}])
shape: (2,)
Series: '' [struct[2]]
[
{1,["x", "y"]}
{2,["z"]}
]
as_polars_series(
data.frame(a = 1:2, b = list(c("x", "y"), "z"))
)
polars Series: shape: (2,)
Series: '' [struct[3]]
[
{1,"x","z"}
{2,"y","z"}
]
And it doesn't work for DataFrame:
pl$DataFrame(
data.frame(a = 1)
)
shape: (1, 1)
┌─────┐
│ a │
│ --- │
│ f64 │
╞═════╡
│ 1.0 │
└─────┘
Maybe we should say that we can't reliably create a Struct from scratch and point towards $to_struct() instead.
Actually it's hard to construct Struct for Series:
We should use the I() function to create a list type column with data.frame(). Or, we can use tibble::tibble() or data.table::data.table() instead.
polars::as_polars_series(
data.frame(a = 1:2, b = list(c("x", "y"), "z"))
)
#> polars Series: shape: (2,)
#> Series: '' [struct[3]]
#> [
#> {1,"x","z"}
#> {2,"y","z"}
#> ]
polars::as_polars_series(
data.frame(a = 1:2, b = I(list(c("x", "y"), "z")))
)
#> polars Series: shape: (2,)
#> Series: '' [struct[2]]
#> [
#> {1,["x", "y"]}
#> {2,["z"]}
#> ]
polars::as_polars_series(
tibble::tibble(a = 1:2, b = list(c("x", "y"), "z"))
)
#> polars Series: shape: (2,)
#> Series: '' [struct[2]]
#> [
#> {1,["x", "y"]}
#> {2,["z"]}
#> ]
polars::as_polars_series(
data.table::data.table(a = 1:2, b = list(c("x", "y"), "z"))
)
#> polars Series: shape: (2,)
#> Series: '' [struct[2]]
#> [
#> {1,["x", "y"]}
#> {2,["z"]}
#> ]
Created on 2024-04-10 with reprex v2.1.0
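For context, the struct[3] result seems to come from base R rather than from polars: data.frame() expands a bare list argument into one column per list element, while I() keeps it as a single list-column. A minimal base R sketch (the variable names flattened and with_asis are only illustrative):

```r
# Base R only; no polars needed for this illustration.

# Without I(): data.frame() expands the list into one column per element
# (the shorter element "z" is recycled to two rows).
flattened <- data.frame(a = 1:2, b = list(c("x", "y"), "z"))

# With I(): the list is kept as a single list-column.
with_asis <- data.frame(a = 1:2, b = I(list(c("x", "y"), "z")))

ncol(flattened)  # 3 columns, matching struct[3]
ncol(with_asis)  # 2 columns, matching struct[2]
is.list(with_asis$b)           # TRUE: b is a list-column
lengths(unclass(with_asis$b))  # element lengths: 2 and 1
```

This is why only the I() variant (or tibble / data.table, which keep list-columns by default) reaches polars as a two-field struct.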
And it doesn't work for DataFrame:
pl$DataFrame() works like as_polars_df() when it receives a data.frame.
(I think this behavior is worth removing because I find it confusing, but the point is that data.frame() works the same way, and in Python, polars.DataFrame.__init__() will convert a pandas.DataFrame to a polars.DataFrame, so this is consistent behavior.)
polars::pl$DataFrame(data.frame(a = 1))
#> shape: (1, 1)
#> ┌─────┐
#> │ a │
#> │ --- │
#> │ f64 │
#> ╞═════╡
#> │ 1.0 │
#> └─────┘
polars::pl$DataFrame(a = data.frame(a = 1))
#> shape: (1, 1)
#> ┌───────────┐
#> │ a │
#> │ --- │
#> │ struct[1] │
#> ╞═══════════╡
#> │ {1.0} │
#> └───────────┘
Created on 2024-04-10 with reprex v2.1.0
I don't think we have a way to create object and struct from our standard c() and list(), but maybe I'm missing something?
It would be good to have a small table in the docs to show the equivalent (if any) of those: