uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Fix Fix: The 'median size too small' warning is too frequent #538 #541

Closed: liangz1 closed this 4 years ago

liangz1 commented 4 years ago

Remove reverse=True from the sort so that the larger of the two middle sizes is actually selected as the median when there is a tie (an even number of files).
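To illustrate why the sort direction matters, here is a minimal sketch. It assumes the median is picked by indexing the sorted list at `len(sizes) // 2`; the function names and sample sizes are hypothetical and are not taken from `spark_dataset_converter.py`.

```python
def median_ascending(sizes):
    """Pick the element at index len // 2 of the ascending sort.

    For an even-length list this is the LARGER of the two middle values.
    """
    mid_index = len(sizes) // 2
    return sorted(sizes)[mid_index]


def median_descending(sizes):
    """Same index, but on a descending sort (reverse=True).

    For an even-length list this is the SMALLER of the two middle values.
    """
    mid_index = len(sizes) // 2
    return sorted(sizes, reverse=True)[mid_index]


if __name__ == "__main__":
    # Hypothetical file sizes, even count, so the two middle values differ.
    sizes = [10, 20, 30, 40]
    print(median_ascending(sizes))   # 30
    print(median_descending(sizes))  # 20
```

With an odd number of files both variants return the same middle element, so the difference only shows up for even counts. In that case the descending sort reports the smaller middle value, which would make a size-threshold warning more likely to trigger.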

codecov[bot] commented 4 years ago

Codecov Report

Merging #541 into master will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #541   +/-   ##
=======================================
  Coverage   86.53%   86.53%           
=======================================
  Files          85       85           
  Lines        4717     4717           
  Branches      743      743           
=======================================
  Hits         4082     4082           
  Misses        516      516           
  Partials      119      119           
Impacted Files                                Coverage Δ
petastorm/spark/spark_dataset_converter.py    92.50% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update c370bac...6bcb647.