uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
Apache License 2.0

Support on-premise s3-compatible storage. #625

Closed acmore closed 3 years ago

acmore commented 3 years ago

Support on-premise S3-compatible storage. Users can now pass a pyarrow filesystem instance to the make_reader and make_batch_reader functions.
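A rough sketch of how this might be used against an on-premise endpoint. Several things here are assumptions and not taken from the PR text: the keyword name `filesystem`, the use of `s3fs` wrapped in the legacy `pyarrow.filesystem.S3FSWrapper` (only available in older pyarrow releases), and the helper names `build_s3fs_kwargs` and `open_reader`, which are hypothetical. The endpoint, bucket, and credentials in the usage note are placeholders.

```python
def build_s3fs_kwargs(endpoint_url, access_key, secret_key):
    """Collect s3fs.S3FileSystem arguments for a non-AWS, S3-compatible endpoint."""
    return {
        "key": access_key,
        "secret": secret_key,
        # client_kwargs is forwarded to the underlying botocore client;
        # endpoint_url points it at the on-premise service instead of AWS.
        "client_kwargs": {"endpoint_url": endpoint_url},
    }


def open_reader(dataset_url, s3fs_kwargs):
    """Open a petastorm reader on top of a caller-supplied filesystem (hypothetical helper)."""
    # Third-party imports are local so build_s3fs_kwargs can be used
    # without petastorm, pyarrow, or s3fs installed.
    import s3fs
    from pyarrow.filesystem import S3FSWrapper  # legacy pyarrow filesystem API
    from petastorm import make_reader

    fs = S3FSWrapper(s3fs.S3FileSystem(**s3fs_kwargs))
    # Assumption: the keyword argument added by this PR is named `filesystem`.
    return make_reader(dataset_url, filesystem=fs)
```

Something like `open_reader("s3://my-bucket/my-dataset", build_s3fs_kwargs("http://minio.example.internal:9000", "ACCESS", "SECRET"))` would then return a reader that can be used as a context manager. The same filesystem object would presumably be passed to make_batch_reader in the same way.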

CLAassistant commented 3 years ago

All committers have signed the CLA.

selitvin commented 3 years ago

For the PR to land, please:

  • rebase
  • fix linter issue/s
  • make sure tests pass

codecov[bot] commented 3 years ago

Codecov Report

Merging #625 (713eb1a) into master (7428f3f) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #625   +/-   ##
=======================================
  Coverage   85.32%   85.32%           
=======================================
  Files          85       85           
  Lines        4933     4933           
  Branches      783      783           
=======================================
  Hits         4209     4209           
  Misses        584      584           
  Partials      140      140           
Impacted Files                       Coverage Δ
petastorm/etl/dataset_metadata.py    87.33% <100.00%> (ø)
petastorm/fs_utils.py                91.75% <100.00%> (ø)
petastorm/reader.py                  89.32% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 7428f3f...713eb1a.

acmore commented 3 years ago

> For the PR to land, please:
>
>   • rebase
>   • fix linter issue/s
>   • make sure tests pass

Sure. All the tests passed.

selitvin commented 3 years ago

Can you please populate the PR comment?

acmore commented 3 years ago

> Can you please populate the PR comment?

Oh, sorry, I didn't notice that. Resolved.

selitvin commented 3 years ago

Excellent. Thank you for your PR. I'll cut a release within a couple of days.