uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can also be used from pure Python code.
Apache License 2.0

[ML-10118] Preserve Spark DataFrame schema order when creating a Petastorm dataset/dataloader #513

Closed: WeichenXu123 closed this 4 years ago

WeichenXu123 commented 4 years ago
codecov[bot] commented 4 years ago

Codecov Report

Merging #513 into master will increase coverage by 0.10%. The diff coverage is 77.77%.


@@            Coverage Diff             @@
##           master     #513      +/-   ##
==========================================
+ Coverage   86.02%   86.13%   +0.10%     
==========================================
  Files          81       81              
  Lines        4402     4435      +33     
  Branches      704      713       +9     
==========================================
+ Hits         3787     3820      +33     
  Misses        504      504              
  Partials      111      111              
Impacted Files                               Coverage Δ
petastorm/reader.py                          90.99% <ø> (ø)
petastorm/transform.py                       85.18% <60.00%> (-14.82%)
petastorm/unischema.py                       94.71% <100.00%> (+0.13%)
petastorm/spark/spark_dataset_converter.py   92.73% <0.00%> (+2.11%)
petastorm/pytorch.py                         92.68% <0.00%> (+2.43%)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 0b70510...08329f1.