pola-rs / r-polars

Polars R binding
https://pola-rs.github.io/r-polars/

Is it possible to read from a pipe? #273

Open Liripo opened 1 year ago

Liripo commented 1 year ago

I want to read data from a pipe. Is that possible? Something like this:

readLines(pipe("echo a,b"))
# for example
pl$lazy_csv_reader(pipe("echo a,b"))

sorhawell commented 1 year ago

I looked into the CSV readers again. It appears that py-polars does not support connections either. If the input is a StringIO in py-polars, it is mapped to memory and used as a literal path. The input has to be a path to some file.

You could point to a "named pipe", but that is not seekable, and your OS will likely throw an error via polars like the one below:

# the shell terminal
mkfifo my_pipe
# R session in another terminal
> pl$lazy_csv_reader("my_pipe")
#waiting ...
# shell terminal 
echo a,b > my_pipe
# R session in the other terminal
Error: in rlazy_csv_reader: Io(Os { code: 29, kind: NotSeekable, message: "Illegal seek" }) When calling:  ...

https://unix.stackexchange.com/questions/30885/is-it-possible-to-make-seek-operations-on-a-named-pipe-return-successful
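
To see the same limitation from the R side, here is a minimal sketch using base R's isSeekable(). According to the base R documentation for seek(), only file connections are seekable, so the pipe is expected to report FALSE and the opened file TRUE. The variable names are only for illustration.

```r
# Compare seekability of a regular file connection and a pipe connection.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), tmp)

file_con <- file(tmp, open = "r")         # regular file on disk
pipe_con <- pipe("echo a,b", open = "r")  # output of a shell command

file_seekable <- isSeekable(file_con)
pipe_seekable <- isSeekable(pipe_con)
print(c(file = file_seekable, pipe = pipe_seekable))

# Drain the pipe before closing so the child process exits cleanly.
invisible(readLines(pipe_con))
close(file_con)
close(pipe_con)
unlink(tmp)
```

A reader that needs to seek in its input, as the error above suggests the polars CSV reader does, cannot consume a pipe directly.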

We do have pl$scan_arrow_ipc() for inter-process communication, but that is maybe not what you were hoping for.

You will likely have to sink the pipe to a file first, or convert it into an arrow_ipc file.
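
A minimal sketch of the "sink the pipe to a file first" idea, using only base R. The helper name pipe_to_tempfile() is made up for this example, and it reads the whole pipe output into memory before writing, so it only suits data that fits in RAM.

```r
# Hypothetical helper: drain a shell command's output into a temporary
# file so that a path-based (seekable) reader can open it afterwards.
pipe_to_tempfile <- function(cmd, fileext = ".csv") {
  con <- pipe(cmd, open = "r")
  on.exit(close(con))
  tmp <- tempfile(fileext = fileext)
  writeLines(readLines(con), tmp)
  tmp
}

csv_path <- pipe_to_tempfile("echo a,b")
print(readLines(csv_path))

# With polars installed, the resulting path could then be passed to the
# reader from the question (the function name may differ between versions):
# lf <- polars::pl$lazy_csv_reader(csv_path)
```

The temporary file is a regular file on disk, so the "Illegal seek" error from the named pipe should not apply to it.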

eitsupi commented 1 year ago

arrow::read_csv_arrow seems to support this.

arrow::read_csv_arrow(pipe("echo 'a,b
1,2'"))
#>   a b
#> 1 1 2

Created on 2023-07-07 with reprex v2.0.2

So I think we can use this as a workaround:

polars::pl$from_arrow(arrow::read_csv_arrow(pipe("echo 'a,b
1,2'"), as_data_frame = FALSE))
#> shape: (1, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ i64 ┆ i64 │
#> ╞═════╪═════╡
#> │ 1   ┆ 2   │
#> └─────┴─────┘
