delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

Windows paths: spaces are parsed to %20 (reading delta table) #1391

Open ABChristian opened 1 year ago

ABChristian commented 1 year ago

Environment

Delta-rs version: Python deltalake 0.9.0

Environment:


Bug

What happened: Reading a delta table inside a folder whose path contains spaces throws an error. A new folder with "%20" in place of the spaces is created.

This folder/delta table was written using deltalake, and created in the expected location.

What you expected to happen: The table should be created and read in the same folder.

How to reproduce it: `import deltalake as dl

dl.write_deltalake(table_or_uri="./deltalake/data", data=DATAFRAME, mode="append")
data = dl.DeltaTable(table_uri="./deltalake/data")
print(data.to_pandas())`

More details:

      File "(...)\Lib\site-packages\deltalake\table.py", line 442, in to_pandas
        return self.to_pyarrow_table(
               ^^^^^^^^^^^^^^^^^^^^^^
      File "(...)\Lib\site-packages\deltalake\table.py", line 426, in to_pyarrow_table
        ).to_table(columns=columns)
        ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "pyarrow_dataset.pyx", line 546, in pyarrow._dataset.Dataset.to_table
      File "pyarrow_dataset.pyx", line 3449, in pyarrow._dataset.Scanner.to_table
      File "pyarrow\error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
      File "pyarrow_fs.pyx", line 1551, in pyarrow._fs._cb_open_input_file
      File "(...)\Lib\site-packages\deltalake\fs.py", line 22, in open_input_file
        return pa.PythonFile(DeltaFileSystemHandler.open_input_file(self, path))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    deltalake.PyDeltaTableError: Object at location C:\Folders\Prefix%20-%20Suffix\deltalake\data\0-8162c431-b69c-4931-8618-75de0202cbc6-0.parquet not found: (...) (os error 2)
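For anyone following along, the `%20` in the error message is what URL percent-encoding produces for a space. The sketch below only illustrates that transformation with Python's standard library. It is not the delta-rs code path, and the example path is an assumption modelled on the one in the traceback.

```python
from urllib.parse import quote, unquote

# Hypothetical path, modelled on the location reported in the error message.
original = "C:/Folders/Prefix - Suffix/deltalake/data"

# Percent-encoding (as applied to URL paths) turns every space into %20.
# "/" and ":" are marked safe so only the spaces are rewritten.
encoded = quote(original, safe="/:")
print(encoded)

# Decoding restores the real on-disk path. If the encoded form is handed to
# the filesystem without this step, the lookup targets a folder whose name
# literally contains "%20".
decoded = unquote(encoded)
print(decoded)
```

If the encoded string is used as a filesystem path without being decoded first, the result would match the symptom reported here: a second folder named with `%20` and a "not found" error on read.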

djouallah commented 1 year ago

same issue here

Phil-T1 commented 11 months ago

Using 0.10.2, I'm having this same issue where Windows paths cannot be decoded due to their spaceyness.

Writing only succeeds for paths that have no space characters. :(

JvdH-NL commented 4 months ago

Using 0.16.4, I am having the same issue. Running Python code from a path that contains spaces gives me 'object at location <FULL PATH WITH SPACES that have been replaced with %20 .... parquet file> not found ....'. I have some sample data; the code fails on the last statement. The path is on a corporate OneDrive and contains spaces.

    delta_table_path = 'deltaTable/'
    dt = DeltaTable(delta_table_path)

    # Read Data from Delta table
    dt.to_pandas()
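Note that a relative path such as `'deltaTable/'` has no spaces itself, but it resolves against the current working directory, which in this report does contain spaces. A small diagnostic along these lines can confirm whether a given table path is affected. The helper name `path_has_spaces` is hypothetical, and this only detects the condition, it does not fix it.

```python
from pathlib import Path


def path_has_spaces(table_path: str) -> bool:
    """Return True if the absolute form of table_path contains a space.

    Relative paths are resolved against the current working directory,
    so a space-free relative path can still yield True.
    """
    return " " in str(Path(table_path).absolute())


# Example: check the relative path used in the snippet above.
print(path_has_spaces("deltaTable/"))
```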

JvdH-NL commented 4 months ago

The Quickstart on the homepage results in an error when reading (the writing part works fine); it is the same issue as in my previous post.