Closed liangz1 closed 4 years ago
Merging #505 into master will not change coverage. The diff coverage is n/a.
@@           Coverage Diff           @@
##           master     #505   +/-   ##
=======================================
  Coverage   86.17%   86.17%
=======================================
  Files          81       81
  Lines        4421     4421
  Branches      704      704
=======================================
  Hits         3810     3810
  Misses        502      502
  Partials      109      109
To mock make_batch_reader in tests, the following simple code should work:
from contextlib import contextmanager

@contextmanager
def mock_make_batch_reader():
    captured_args = []
    import petastorm
    original_make_batch_reader = petastorm.make_batch_reader

    def mock_fn(dataset_url, **kwargs):
        captured_args.append({'dataset_url': dataset_url, **kwargs})
        return original_make_batch_reader(dataset_url, **kwargs)

    petastorm.make_batch_reader = mock_fn
    try:
        yield captured_args
    finally:
        petastorm.make_batch_reader = original_make_batch_reader

with mock_make_batch_reader() as captured_args:
    from petastorm import make_batch_reader
    with make_batch_reader('file:///tmp/t0001', workers_count=18) as reader:
        for i in reader:
            print(i)
    print('get captured args: ' + str(captured_args))
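The same capture-and-delegate idea could also be written with the standard library's unittest.mock, in case that is preferable to a hand-rolled context manager. This is only a sketch: fake_petastorm and read_all are hypothetical stand-ins (not part of petastorm) so that the snippet runs without a dataset. In a real test the patch target would be the petastorm module itself.

```python
import types
from unittest import mock

# Hypothetical stand-in for the petastorm module; it only echoes the URL back.
fake_petastorm = types.SimpleNamespace(
    make_batch_reader=lambda dataset_url, **kwargs: iter([dataset_url])
)


def read_all(dataset_url, **kwargs):
    # Hypothetical code under test. It looks the function up on the module
    # at call time, so the patched attribute is the one that gets called.
    return list(fake_petastorm.make_batch_reader(dataset_url, **kwargs))


original = fake_petastorm.make_batch_reader

# wraps= makes the mock record each call and then delegate to the original.
with mock.patch.object(fake_petastorm, "make_batch_reader", wraps=original) as spy:
    rows = read_all("file:///tmp/t0001", workers_count=18)

# call_args holds the positional and keyword arguments of the last call.
captured = spy.call_args
print("get captured args:", captured.args, captured.kwargs)
```

One caveat with either approach: code that ran `from petastorm import make_batch_reader` before the patch was applied keeps a reference to the original function, so the import has to happen inside the patched scope (as in the snippet above this one) or the call has to go through the module attribute.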
Let's wait for PR https://github.com/uber/petastorm/pull/506 to merge first, and then reuse some of its code here.
@liangz1 @mengxr @selitvin do we know if this PR can leverage the performance boost being offered in #492? If yes, it might be a nice idea to get that merged too, given all tests pass.
Also, a minor nit on the PR title: shouldn't it be the following?
Simplify data conversion from Spark to PyTorch DataLoader
Looks Good!
What changes are proposed in this PR?
Add converter.make_torch_dataloader() with advanced params.
The latest API
Example Code (PyTorch)