uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Support passing multiple url files to make_reader function. #731

Status: Closed (selitvin closed this 2 years ago)

selitvin commented 2 years ago

Resolves #728.
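
For readers unfamiliar with the request, here is a minimal sketch of what accepting either one URL or several can look like at an API boundary. The helper name `as_url_list` and its behavior are hypothetical and written for illustration only; this is not petastorm's code and not the change made in this PR.

```python
from typing import List, Sequence, Union


def as_url_list(dataset_url_or_urls: Union[str, Sequence[str]]) -> List[str]:
    """Hypothetical helper: return the argument as a non-empty list of URL strings."""
    # A single string is treated as a one-element list.
    if isinstance(dataset_url_or_urls, str):
        urls = [dataset_url_or_urls]
    else:
        urls = list(dataset_url_or_urls)

    if not urls:
        raise ValueError("at least one dataset URL is required")
    for url in urls:
        if not isinstance(url, str):
            raise TypeError(f"expected a URL string, got {type(url).__name__}")
    return urls


# Both call styles yield a list the rest of the reader code can iterate over.
single = as_url_list("file:///tmp/dataset/part-0.parquet")
several = as_url_list([
    "file:///tmp/dataset/part-0.parquet",
    "file:///tmp/dataset/part-1.parquet",
])
```

Normalizing at the entry point keeps the rest of the code path independent of which form the caller used.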

chongxiaoc commented 2 years ago

what's wrong with CI recently?

selitvin commented 2 years ago

> what's wrong with CI recently?

I should have added you as a reviewer before making CI pass. Sorry. I think this is my fault specifically.

codecov[bot] commented 2 years ago

Codecov Report

Merging #731 (920a4c9) into master (54e6bc2) will increase coverage by 0.08%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #731      +/-   ##
==========================================
+ Coverage   86.18%   86.27%   +0.08%     
==========================================
  Files          85       85              
  Lines        5082     5084       +2     
  Branches      787      787              
==========================================
+ Hits         4380     4386       +6     
+ Misses        561      559       -2     
+ Partials      141      139       -2     
| Impacted Files | Coverage Δ |
|---|---|
| petastorm/py_dict_reader_worker.py | 95.45% <100.00%> (+0.06%) :arrow_up: |
| petastorm/reader.py | 90.69% <100.00%> (+0.93%) :arrow_up: |
| petastorm/spark/spark_dataset_converter.py | 91.33% <0.00%> (+0.72%) :arrow_up: |
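
The headline percentages in the coverage diff can be reproduced from its Hits and Lines rows. The sketch below assumes coverage is hits divided by lines and that the displayed figures are rounded down to two decimals; the rounding mode is an assumption chosen because it matches the numbers in the report.

```python
import math


def floor2(value: float) -> float:
    """Round down to two decimal places (assumed display rounding)."""
    return math.floor(value * 100) / 100


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines that were hit."""
    return 100.0 * hits / lines


# Figures taken from the coverage diff: master vs. #731.
master = {"lines": 5082, "hits": 4380, "misses": 561, "partials": 141}
pr = {"lines": 5084, "hits": 4386, "misses": 559, "partials": 139}

# Hits, misses, and partials account for every tracked line.
for report in (master, pr):
    assert report["hits"] + report["misses"] + report["partials"] == report["lines"]

before = coverage_percent(master["hits"], master["lines"])
after = coverage_percent(pr["hits"], pr["lines"])

print(floor2(before))          # 86.18
print(floor2(after))           # 86.27
print(floor2(after - before))  # 0.08
```

The change is computed from the unrounded percentages, which is why it shows as 0.08 rather than the 0.09 obtained by subtracting the two displayed values.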


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 54e6bc2...920a4c9.