uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Remove very old pickle compatibility code modifying old atg package names #702

Open selitvin opened 3 years ago

selitvin commented 3 years ago

This is code cleanup. The code was meant to support opening some old datasets that contained pickled data with pre-petastorm names.
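For readers unfamiliar with this kind of compatibility shim, here is a minimal sketch of how pickled data that references a renamed package can be loaded by remapping module names at unpickle time. It is not the code removed in this PR: the module names `legacy_models` and `current_models`, the `Field` class, and the helper `loads_with_renames` are all hypothetical and exist only for illustration. The only library behavior relied on is that `pickle.Unpickler.find_class` can be overridden.

```python
import io
import pickle
import sys
import types

# Hypothetical old -> new module names; NOT Petastorm's real mapping.
LEGACY_MODULES = {"legacy_models": "current_models"}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that rewrites legacy module names before resolving classes."""

    def find_class(self, module, name):
        # Swap the module name if it is a known legacy name, then resolve normally.
        module = LEGACY_MODULES.get(module, module)
        return super().find_class(module, name)


def loads_with_renames(data):
    """Like pickle.loads, but tolerant of the renamed modules listed above."""
    return RenamingUnpickler(io.BytesIO(data)).load()


class Field:
    """Stand-in for a class that was pickled under an old module name."""

    def __init__(self, name):
        self.name = name


def make_legacy_pickle():
    """Pickle a Field while it pretends to live in the old module."""
    legacy = types.ModuleType("legacy_models")
    legacy.Field = Field
    sys.modules["legacy_models"] = legacy
    Field.__module__ = "legacy_models"
    try:
        return pickle.dumps(Field("id"))
    finally:
        # Simulate the rename: the old module is no longer importable.
        del sys.modules["legacy_models"]


def install_current_module():
    """Expose Field under the new module name only."""
    current = types.ModuleType("current_models")
    current.Field = Field
    sys.modules["current_models"] = current
    Field.__module__ = "current_models"


payload = make_legacy_pickle()
install_current_module()
restored = loads_with_renames(payload)
print(type(restored).__module__, restored.name)
```

Calling plain `pickle.loads(payload)` at the end of this sketch would fail, because `legacy_models` can no longer be imported. Dropping a shim of this kind means such old payloads stop loading, which is the trade-off a cleanup like this accepts.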

I cannot specify selitvin:selitvin/remove_legacy as the upstream, so GitHub shows two commits in this PR. Please review only the second commit, as the first is part of #640.

codecov[bot] commented 3 years ago

Codecov Report

Merging #702 (e54d908) into master (d32709d) will increase coverage by 0.00%. The diff coverage is 86.66%.

@@           Coverage Diff           @@
##           master     #702   +/-   ##
=======================================
  Coverage   86.27%   86.27%           
=======================================
  Files          85       85           
  Lines        5084     5070   -14     
  Branches      787      784    -3     
=======================================
- Hits         4386     4374   -12     
+ Misses        559      558    -1     
+ Partials      139      138    -1     

Impacted Files                        Coverage Δ
petastorm/etl/safe_pickle.py          81.81% <81.81%> (ø)
petastorm/etl/dataset_metadata.py     87.33% <100.00%> (ø)
petastorm/etl/rowgroup_indexing.py    64.51% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update d32709d...e54d908.

CLAassistant commented 1 year ago

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

:white_check_mark: selitvin
:x: Yevgeni Litvin


Yevgeni Litvin does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.