uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Verify that scalars and not arrays are passed to a ScalarCodec instance #498

Status: Closed (selitvin closed 4 years ago)

selitvin commented 4 years ago

Validating types early in the execution path and failing with a clear message is better than letting vague errors surface from downstream code.
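To illustrate the idea, here is a minimal sketch of an early scalar check. It is not the code from this PR: `ensure_scalar` and `ScalarValidationError` are hypothetical names, and the rule used (plain numbers, strings, bytes, and zero-dimensional array-likes count as scalars) is an assumption made for the example.

```python
import numbers


class ScalarValidationError(TypeError):
    """Hypothetical error type raised when a non-scalar reaches a scalar field."""


def ensure_scalar(field_name, value):
    """Return `value` unchanged if it looks like a scalar, otherwise raise.

    Assumption for this sketch: numbers, bools, strings and bytes are scalars,
    and so is any object exposing an empty `shape` (a zero-dimensional array-like).
    """
    if isinstance(value, (str, bytes, bool, numbers.Number)):
        return value

    # Array-likes usually expose `shape`; an empty shape means zero dimensions.
    shape = getattr(value, "shape", None)
    if shape is not None and len(shape) == 0:
        return value

    # Fail here, naming the field and the offending type, instead of letting
    # a serializer further down raise something harder to interpret.
    raise ScalarValidationError(
        "Field {!r} expects a scalar value, but got {}: {!r}".format(
            field_name, type(value).__name__, value
        )
    )
```

Calling `ensure_scalar("id", [1, 2, 3])` raises immediately with the field name in the message, while `ensure_scalar("id", 5)` simply returns `5`.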

codecov[bot] commented 4 years ago

Codecov Report

Merging #498 into master will increase coverage by 0.67%. The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #498      +/-   ##
==========================================
+ Coverage   85.77%   86.45%   +0.67%     
==========================================
  Files          79       81       +2     
  Lines        4190     4554     +364     
  Branches      665      754      +89     
==========================================
+ Hits         3594     3937     +343     
- Misses        494      509      +15     
- Partials      102      108       +6     
Impacted Files                               Coverage Δ
petastorm/codecs.py                          77.77% <100.00%> (-1.17%)
petastorm/unischema.py                       94.58% <0.00%> (ø)
petastorm/spark/__init__.py                  100.00% <0.00%> (ø)
petastorm/spark/spark_dataset_converter.py   93.27% <0.00%> (ø)
petastorm/etl/dataset_metadata.py            88.96% <0.00%> (+0.07%)
petastorm/arrow_reader_worker.py             93.06% <0.00%> (+1.06%)
petastorm/reader.py                          92.38% <0.00%> (+1.55%)
petastorm/fs_utils.py                        92.00% <0.00%> (+3.26%)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 6003a98...a97437d.