pola-rs / polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust
https://docs.pola.rs

writing to os.devnull #17662

Open franzhaas opened 3 months ago

franzhaas commented 3 months ago

Reproducible example

import os
import polars as pl
pl.DataFrame().write_excel(os.devnull)

Log output

FileCreateError: [Errno 13] Permission denied: '/dev/null.xlsx'

Issue description

I can't use os.devnull to write an xlsx into "nowhere"; this does work for Parquet and IPC.

Expected behavior

I expect to be able to write into "nowhere".

Installed versions

```
--------Version info---------
Polars:               1.1.0
Index type:           UInt32
Platform:             Linux-6.5.0-21-generic-x86_64-with-glibc2.35
Python:               3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_manager:
cloudpickle:
connectorx:
deltalake:
fastexcel:            0.10.4
fsspec:
gevent:
great_tables:
hvplot:
matplotlib:
nest_asyncio:
numpy:                2.0.0
openpyxl:
pandas:               2.2.2
pyarrow:              16.1.0
pydantic:
pyiceberg:
sqlalchemy:
torch:
xlsx2csv:
xlsxwriter:           3.2.0
```

anergictcell commented 3 months ago

This is caused by the _xl_setup_workbook method, which appends an .xlsx suffix to the given path (hence '/dev/null.xlsx' in the error):

https://github.com/pola-rs/polars/blob/1df3b0b7eea418b7f7878c8ddbec1699143becb7/py-polars/polars/io/spreadsheet/_write_utils.py#L538-L539

I can think of two possible options to fix this:

  1. Don't append a suffix if it's missing. This might break some existing workflows, so it is most likely not desired.
  2. Add an argument to write_excel that controls the appended suffix, such as fallback_suffix=".xlsx", with ".xlsx" as the default. That would keep existing functionality unchanged, and users could set it to an empty string if they don't want a suffix added.

I could implement one of these options and update the documentation, if desired.
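
A minimal sketch of how option 2 might behave, using only pathlib. The helper name resolve_workbook_path and the fallback_suffix parameter are hypothetical and are not part of the current polars API; the real change would live inside _xl_setup_workbook and could look different.

```python
from pathlib import Path


def resolve_workbook_path(file, fallback_suffix=".xlsx"):
    """Hypothetical helper illustrating option 2 (not actual polars code).

    Appends `fallback_suffix` only when the path has no suffix of its own
    and a non-empty fallback was requested.
    """
    path = Path(file)
    if fallback_suffix and not path.suffix:
        path = path.with_suffix(fallback_suffix)
    return path


# Default keeps today's behaviour: a missing suffix becomes ".xlsx".
print(resolve_workbook_path("/dev/null"))
# An empty fallback leaves the path untouched, so os.devnull would work.
print(resolve_workbook_path("/dev/null", fallback_suffix=""))
```

With this shape, existing calls are unaffected, and writing to "nowhere" would only require passing an empty string for the fallback.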