Closed abditag2 closed 4 years ago
Merging #566 into master will decrease coverage by
3.01%
. The diff coverage is18.75%
.
@@ Coverage Diff @@
## master #566 +/- ##
==========================================
- Coverage 85.78% 82.77% -3.02%
==========================================
Files 87 87
Lines 4968 4969 +1
Branches 792 793 +1
==========================================
- Hits 4262 4113 -149
- Misses 572 723 +151
+ Partials 134 133 -1
Impacted Files | Coverage Δ | |
---|---|---|
...ark_dataset_converter/pytorch_converter_example.py | 0.00% <0.00%> (-94.00%) |
:arrow_down: |
..._dataset_converter/tensorflow_converter_example.py | 0.00% <0.00%> (-85.11%) |
:arrow_down: |
examples/spark_dataset_converter/utils.py | 31.25% <25.00%> (-68.75%) |
:arrow_down: |
..._dataset_converter/tests/test_converter_example.py | 58.33% <50.00%> (-41.67%) |
:arrow_down: |
petastorm/spark/spark_dataset_converter.py | 90.26% <0.00%> (-1.50%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 1ed3f60...1299d24. Read the comment docs.
Hi @abditag2 - thanks for your contribution here! However, I would prefer that we do not bake data into the main container. If downloading MNIST data is an inconvenience, can we find an alternative here? How does storing it in the container fix this issue for you? If you need to keep it in a container, would you be willing to create a second
Dockerfile
that inherits from the main one and then use that?
@jsgoller1 The link to download the MNIST dataset is a university link and at times it is extremely slow i.e., 3kb/s slow. This diff bakes the data into the docker and for the tests, eliminates the data download requirements. When running the tests, they read the data from local folder rather than downloading every time.
However, I have updated the tests to use the download if the data is not present in the docker image. So, it does not overwrite previous behavior.
Please look at my other diff. We built a docker image using this branch and I am using this docker in the other diff to unblock tests.
The link to download mnist data is very slow and tests time out. this diff bakes the mnist data into the docker image.