Describe the bug
When I try to read a large Parquet file with Xorbits using pd.read_parquet('my_large_file.pqt'), it fails with the stack trace below. I know the data fits in memory because pandas can read it, albeit slowly. The files are between 1.5 GB and 4.5 GB in size.
2023-10-02 14:15:39,732 xorbits._mars.deploy.oscar.local 25232 WARNING Web service started at http://127.0.0.1:59977
0%| | 0.00/100 [00:00<?, ?it/s]
2023-10-02 14:15:39,929 xorbits._mars.services.scheduling.worker.execution 25232 ERROR Failed to run subtask eDnnloDuO0VyUkOf3tneQUCY on band numa-0
Traceback (most recent call last):
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 494, in internal_run_subtask
    subtask_info.result = await self._retry_run_subtask(
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 618, in _retry_run_subtask
    return await _retry_run(subtask, subtask_info, _run_subtask_once)
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 192, in _retry_run
    raise ex
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 154, in _retry_run
    return await target_async_func(*args)
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 527, in _run_subtask_once
    await quota_ref.request_batch_quota(batch_quota_req)
  File "xoscar\\core.pyx", line 284, in __pyx_actor_method_wrapper
  File "xoscar\\core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\quota.py", line 119, in request_batch_quota
    raise ValueError(
ValueError: Cannot allocate quota size 19629902034.0 larger than total capacity 13668492902.
2023-10-02 14:15:39,932 xorbits._mars.services.scheduling.worker.execution 25232 ERROR Failed to run subtask GeGwp96LjmNNdpVR6oS8x325 on band numa-0
Traceback (most recent call last):
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 494, in internal_run_subtask
    subtask_info.result = await self._retry_run_subtask(
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 618, in _retry_run_subtask
    return await _retry_run(subtask, subtask_info, _run_subtask_once)
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 192, in _retry_run
    raise ex
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 154, in _retry_run
    return await target_async_func(*args)
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\execution.py", line 527, in _run_subtask_once
    await quota_ref.request_batch_quota(batch_quota_req)
  File "xoscar\\core.pyx", line 284, in __pyx_actor_method_wrapper
  File "xoscar\\core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
  File "C:\Users\kngka\miniconda3\envs\agd\lib\site-packages\xorbits\_mars\services\scheduling\worker\quota.py", line 119, in request_batch_quota
    raise ValueError(
ValueError: Cannot allocate quota size 26955130596.0 larger than total capacity 13668492902.
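To make the two error messages easier to compare, the figures can be converted to GiB. This is only a rough sketch: it assumes the numbers are byte counts (the message does not state a unit), and the names `total_capacity`, `requested` and `to_gib` are made up for illustration.

```python
# Values copied from the two ValueError messages in the trace.
# Assumption: the figures are byte counts; the message itself gives no unit.
GIB = 1024 ** 3

total_capacity = 13668492902
requested = [19629902034, 26955130596]


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


for size in requested:
    print(
        f"requested {to_gib(size):.2f} GiB, "
        f"capacity {to_gib(total_capacity):.2f} GiB, "
        f"ratio {size / total_capacity:.2f}x"
    )
```

Under that assumption, both requested quotas are larger than the reported total capacity, which is the condition the error message names.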
To Reproduce
To help us to reproduce this bug, please provide information below:
Your Python version: python 3.10
The version of Xorbits you use: xorbits 0.6.3 pypi_0 pypi
Versions of crucial packages, such as numpy, scipy and pandas
Full stack of the error.
Minimized code to reproduce the error.
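A minimal reproduction might look roughly like the sketch below. It assumes that `pd` in the description refers to `xorbits.pandas` and that a local runtime is started with `xorbits.init()`; the function name `reproduce` is hypothetical, and the file path is the placeholder from the description rather than a real file.

```python
def reproduce(path="my_large_file.pqt"):
    """Attempt the read described in the report; `path` is a placeholder."""
    # Imports live inside the function so that defining it does not
    # require Xorbits to be installed.
    import xorbits
    import xorbits.pandas as pd

    xorbits.init()              # start a local Xorbits runtime
    df = pd.read_parquet(path)  # the call named in the report
    print(df)                   # printing forces the deferred read to run
    return df
```

On Windows, calling `reproduce()` from under an `if __name__ == "__main__":` guard is the safer choice, since the local runtime starts worker processes.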
Expected behavior
A clear and concise description of what you expected to happen.
Additional context
Add any other context about the problem here.