xorbitsai / xorbits

Scalable Python DS & ML, in an API compatible & lightning fast way.
https://xorbits.readthedocs.io
Apache License 2.0
1.11k stars 67 forks

BUG: Integrated pandas can't Read CSV while latest pandas can #730

Open charliedream1 opened 1 year ago

charliedream1 commented 1 year ago

Describe the bug

To Reproduce

To help us to reproduce this bug, please provide information below:

  1. Your Python version: 3.10
  2. The version of Xorbits you use: 0.6.3
  3. Versions of crucial packages, such as numpy, scipy and pandas: numpy 1.26.0, scipy 1.11.3, pandas 2.1.1
  4. Full stack of the error.
  5. Minimized code to reproduce the error.

Expected behavior

A clear and concise description of what you expected to happen.

Additional context

Add any other context about the problem here.

codingl2k1 commented 12 months ago

Problem 1: Is your CSV file on local disk or remote (accessed by URL)?

Problem 2: Are you loading the CSV with pandas and then constructing a Xorbits DataFrame from the pandas DataFrame? If so, the crash could be caused by running out of memory, because the full data is serialized and sent to the worker.

Problem 3: The "too many open files" error can be fixed by raising the open-file limit with ulimit.
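For Problems 2 and 3, a minimal sketch of what that could look like from Python on a Unix-like system. The helper names `raise_open_file_limit` and `read_csv_distributed` and the target of 4096 are illustrative assumptions, not part of Xorbits. The first helper uses the standard-library `resource` module as an in-process alternative to running `ulimit -n` in the shell. The second reads the file through `xorbits.pandas.read_csv` instead of building a Xorbits DataFrame from an in-memory pandas one.

```python
import resource


def raise_open_file_limit(target=4096):
    """Raise this process's soft open-file limit, capped at the hard limit.

    Hypothetical helper; roughly what `ulimit -n <target>` does in a shell.
    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return (soft, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # The OS refused the change; report the limits as they are.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def read_csv_distributed(path):
    """Read a CSV through Xorbits rather than via a pandas DataFrame.

    Hypothetical helper; the import is deferred so the limit helper above
    still works when Xorbits is not installed.
    """
    import xorbits.pandas as xpd

    return xpd.read_csv(path)


if __name__ == "__main__":
    print(raise_open_file_limit(4096))
```

The limit must be raised before the Xorbits workers start, since child processes inherit it at launch. Whether this resolves the original report depends on the answers to the questions above.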