uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0
1.78k stars, 285 forks

Allow opening parquet stores with unsupported types. #686

Closed: selitvin closed this 3 years ago

selitvin commented 3 years ago

Previously, an error was raised when an unsupported type (e.g. a nested structure) was found in the data source. This change relaxes the constraint so that fields with unknown types are silently ignored: they simply do not show up in the loaded data.
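To illustrate the behavior change described above, here is a minimal, library-independent sketch. It is not the actual code in petastorm/unischema.py: the `SUPPORTED_TYPES` set, the `select_supported_fields` function, the `strict` flag, and the dict-based schema are all hypothetical names made up for this example. It only shows the general idea of skipping fields with unknown types instead of raising.

```python
# Hypothetical set of type names that the loader knows how to decode.
SUPPORTED_TYPES = {"int32", "int64", "float32", "float64", "string", "bool"}


def select_supported_fields(schema, strict=False):
    """Return only the fields whose type is supported.

    `schema` maps field name -> type name (an assumption for this sketch).
    With strict=True an unsupported type raises, mimicking the old behavior;
    with strict=False the field is silently skipped, mimicking the new one.
    """
    kept = {}
    for name, type_name in schema.items():
        if type_name in SUPPORTED_TYPES:
            kept[name] = type_name
        elif strict:
            raise ValueError(f"Unsupported type {type_name!r} for field {name!r}")
        # Otherwise: drop the field without raising or warning.
    return kept


# A store with one nested (unsupported) column next to two plain ones.
parquet_like_schema = {
    "id": "int64",
    "score": "float32",
    "meta": "struct<a:int32,b:string>",
}

loaded_fields = select_supported_fields(parquet_like_schema)
print(sorted(loaded_fields))  # prints ['id', 'score']
```

In this sketch the nested `meta` column is simply absent from the result, which mirrors the description: the store can still be opened, and only the fields with known types are visible to the caller.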

codecov[bot] commented 3 years ago

Codecov Report

Merging #686 (920eb93) into master (7cda582) will not change coverage. The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #686   +/-   ##
=======================================
  Coverage   85.89%   85.89%           
=======================================
  Files          84       84           
  Lines        4956     4956           
  Branches      788      788           
=======================================
  Hits         4257     4257           
  Misses        560      560           
  Partials      139      139           
Impacted Files            Coverage Δ
petastorm/unischema.py    93.24% <100.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 7cda582...920eb93.