Reading a parquet file from inside a zip archive (opened with zipfile.ZipFile) fails with xorbits, whereas pandas reads the same file object without issues.
To Reproduce
To help us to reproduce this bug, please provide information below:
Your Python version
Python version : 3.10.10
IPython version : 8.13.2
The version of Xorbits you use
xorbits : 0.4.4
Versions of crucial packages, such as numpy, scipy and pandas
numpy : 1.23.5
pandas : 1.5.2
Full stack of the error.
Traceback (most recent call last):
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\site-packages\IPython\core\interactiveshell.py", line 3508, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-66890dd066a4>", line 1, in <module>
runfile('D:\\PERSONAL\\CODE_PROJECTS\\blackarbs_algo_strategy_dev-master\\scripts\\data_exploration.py', wdir='D:\\PERSONAL\\CODE_PROJECTS\\blackarbs_algo_strategy_dev-master\\scripts\\)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:\PERSONAL\CODE_PROJECTS\blackarbs_algo_strategy_dev-master\scripts\data_exploration.py", line 124, in <module>
df = get_symbol_dataframe_from_zip(zip_file_path, symbol)
File "D:\PERSONAL\CODE_PROJECTS\blackarbs_algo_strategy_dev-master\scripts\data_exploration.py", line 94, in get_symbol_dataframe_from_zip
out = read_parquet_files_from_zip(zip_file, filenames, symbol)
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\site-packages\blk_utils\utils.py", line 37, in wrap_func
result = func(*args, **kwargs)
File "D:\PERSONAL\CODE_PROJECTS\blackarbs_algo_strategy_dev-master\scripts\data_exploration.py", line 76, in read_parquet_files_from_zip
df = pd.read_parquet(parquetfile)
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\site-packages\xorbits\core\adapter.py", line 472, in wrapped
return from_mars(c(*to_mars(args), **to_mars(kwargs)))
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\site-packages\xorbits\_mars\dataframe\datasource\read_parquet.py", line 752, in read_parquet
fs = get_fs(single_path, storage_options)
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\site-packages\xorbits\_mars\lib\filesystem\core.py", line 53, in get_fs
scheme = get_scheme(path)
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\site-packages\xorbits\_mars\lib\filesystem\core.py", line 40, in get_scheme
if os.path.exists(path) or glob_.glob(path):
File "C:\Users\kngka\Anaconda3\envs\algodev\lib\genericpath.py", line 19, in exists
os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not ZipExtFile
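The last frames show os.path.exists() receiving the ZipExtFile object itself rather than a path. One possible direction, sketched below, is to check the input type before touching the filesystem. The helper names is_path_like and path_exists_or_false are hypothetical and are not part of xorbits; this only illustrates the kind of guard that would avoid the TypeError.

```python
import os
from typing import Any


def is_path_like(source: Any) -> bool:
    """Return True only for str, bytes, or os.PathLike inputs."""
    return isinstance(source, (str, bytes, os.PathLike))


def path_exists_or_false(source: Any) -> bool:
    """Hypothetical guard: skip the filesystem check for file-like objects.

    File-like inputs (for example zipfile.ZipExtFile or io.BytesIO) are
    never passed to os.path.exists(), so no TypeError is raised for them.
    """
    return is_path_like(source) and os.path.exists(source)
```

With such a guard, a file-like input would fall through to whatever handling the caller chooses for non-path sources instead of failing in os.stat().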
Minimized code to reproduce the error.
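import zipfile
import xorbits.pandas as pd  # per the traceback, pd.read_parquet dispatches to xorbits

# zip_file (path to the archive) and parquet_files (member names) are defined elsewhere
with zipfile.ZipFile(zip_file) as zf:
    for parquet_file in parquet_files:
        with zf.open(parquet_file, "r") as parquetfile:
            df = pd.read_parquet(parquetfile)  # raises the TypeError shown above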
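Until file-like inputs are supported, a possible workaround is to extract the members to disk and pass plain paths instead. The sketch below uses only the standard library; extract_members is a hypothetical helper name, and the assumption that read_parquet accepts path strings is inferred from the traceback (get_scheme calls os.path.exists(path)), not verified against the xorbits documentation.

```python
import zipfile
from typing import Iterable, List


def extract_members(zip_path: str, members: Iterable[str], dest_dir: str) -> List[str]:
    """Extract the named members into dest_dir and return their on-disk paths."""
    with zipfile.ZipFile(zip_path) as zf:
        # ZipFile.extract() returns the path of the file it created
        return [zf.extract(member, path=dest_dir) for member in members]
```

Each returned path could then be passed to pd.read_parquet(path). The destination directory should stay in place until the resulting dataframes have actually been computed, in case the read is deferred.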
Expected behavior
read_parquet should accept a file-like object such as the ZipExtFile returned by ZipFile.open() and return a dataframe, as pandas does.