xorbitsai / xorbits

Scalable Python DS & ML, in an API compatible & lightning fast way.
https://xorbits.readthedocs.io
Apache License 2.0
1.1k stars 67 forks

FEAT: Iterable dataset #643

Closed: codingl2k1 closed this 1 year ago

codingl2k1 commented 1 year ago

What do these changes do?

Implement an iterable dataset for reading an exported dataset. It contains these features.
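For readers unfamiliar with the pattern, here is a minimal sketch of the general idea of an iterable dataset over exported shards. It is not the code in this PR. `ShardedIterableDataset`, `export_rows`, the JSON-lines shard format, and the `shard-00000.jsonl` naming are all hypothetical, chosen only so the example runs with the standard library.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterator, List


def export_rows(rows: List[Dict[str, Any]], root: str, rows_per_shard: int = 2) -> int:
    """Hypothetical exporter: write rows as JSON-lines shards, return the shard count."""
    out = Path(root)
    out.mkdir(parents=True, exist_ok=True)
    num_shards = 0
    for start in range(0, len(rows), rows_per_shard):
        # Zero-padded names keep lexicographic order equal to write order.
        shard = out / f"shard-{num_shards:05d}.jsonl"
        with shard.open("w", encoding="utf-8") as f:
            for row in rows[start:start + rows_per_shard]:
                f.write(json.dumps(row) + "\n")
        num_shards += 1
    return num_shards


class ShardedIterableDataset:
    """Hypothetical iterable dataset: streams rows shard by shard, never loading all shards at once."""

    def __init__(self, root: str, pattern: str = "*.jsonl") -> None:
        self._shards: List[Path] = sorted(Path(root).glob(pattern))

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        # A fresh generator per call, so the dataset can be iterated repeatedly.
        for shard in self._shards:
            with shard.open("r", encoding="utf-8") as f:
                for line in f:
                    line = line.strip()
                    if line:
                        yield json.loads(line)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        export_rows([{"id": i} for i in range(5)], tmp)
        for row in ShardedIterableDataset(tmp):
            print(row)
```

Because `__iter__` opens one shard at a time, memory use is bounded by a single row rather than by the full export. A real implementation would presumably also need to handle remote storage and shuffling, neither of which this sketch attempts.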

Related issue number

Fixes #xxxx

Check code requirements

codecov[bot] commented 1 year ago

Codecov Report

Merging #643 (84cb627) into main (b98a556) will increase coverage by 0.03%. The diff coverage is 96.78%.

@@            Coverage Diff             @@
##             main     #643      +/-   ##
==========================================
+ Coverage   93.60%   93.64%   +0.03%     
==========================================
  Files        1024     1025       +1     
  Lines       79432    79707     +275     
  Branches    16475    16534      +59     
==========================================
+ Hits        74353    74638     +285     
+ Misses       3411     3394      -17     
- Partials     1668     1675       +7     
| Flag | Coverage Δ |
|---|---|
| unittests | 93.53% <96.78%> (+0.03%) :arrow_up: |

Flags with carried forward coverage won't be shown.

| Files Changed | Coverage Δ |
|---|---|
| python/xorbits/datasets/iterable_dataset.py | 96.70% <96.70%> (ø) |
| ...on/xorbits/datasets/backends/huggingface/export.py | 96.17% <100.00%> (+0.04%) :arrow_up: |

... and 10 files with indirect coverage changes

aresnow1 commented 1 year ago

Have you tested on cloud storage like S3?

codingl2k1 commented 1 year ago

> Have you tested on cloud storage like S3?

Not yet. Do we have a test bucket?

codingl2k1 commented 1 year ago

> Have you tested on cloud storage like S3?

I have tested exporting to and reading from S3. It works.