Closed sanmitra closed 4 years ago
Thanks for the contribution! It looks like @ijeri is an internal user so signing the CLA is not required. However, we need to confirm this.
Merging #528 (7501ab5) into master (bc507f2) will increase coverage by
4.26%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #528 +/- ##
==========================================
+ Coverage 82.47% 86.74% +4.26%
==========================================
Files 347 347
Lines 11952 11956 +4
Branches 377 384 +7
==========================================
+ Hits 9858 10371 +513
+ Misses 2094 1585 -509
Impacted Files | Coverage Δ | |
---|---|---|
...cala/com/salesforce/op/OpWorkflowModelWriter.scala | 100.00% <100.00%> (ø) |
|
...sforce/op/stages/impl/feature/Transmogrifier.scala | 98.05% <0.00%> (+0.83%) |
:arrow_up: |
...esforce/op/features/types/FeatureTypeFactory.scala | 99.13% <0.00%> (+0.86%) |
:arrow_up: |
...la/com/salesforce/op/features/FeatureBuilder.scala | 35.17% <0.00%> (+1.37%) |
:arrow_up: |
...la/com/salesforce/op/stages/OpPipelineStages.scala | 63.88% <0.00%> (+1.38%) |
:arrow_up: |
...rce/op/stages/impl/preparators/SanityChecker.scala | 90.57% <0.00%> (+1.63%) |
:arrow_up: |
...scala/com/salesforce/op/testkit/RandomStream.scala | 100.00% <0.00%> (+2.38%) |
:arrow_up: |
...op/stages/DefaultOpPipelineStageReaderWriter.scala | 69.23% <0.00%> (+2.56%) |
:arrow_up: |
.../op/stages/impl/feature/PercentileCalibrator.scala | 96.87% <0.00%> (+3.12%) |
:arrow_up: |
.../scala/com/salesforce/op/dsl/RichTextFeature.scala | 82.19% <0.00%> (+4.10%) |
:arrow_up: |
... and 47 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update bc507f2...f0db8d9. Read the comment docs.
@ijeri can you give an example of the local path? Do you mean because the path is not absolute?
Saving RDD to a local path should work even if the local filesystem is not the default one:
scala> sc.hadoopConfiguration.set("fs.defaultFS", "hdfs://127.0.0.1/")
scala> sc.parallelize(Seq(1,2,3)).saveAsTextFile("file:///tmp/test5")
+1 to stop gzipping op-model json since we now zip the whole bundle
@gerashegalov the file path is a qualified one for ex- file:/a.../b..../op-model.json
makes sense: after rdd is distributed partitions are written out to the worker node local dirs.
@gerashegalov I have addressed your comments. Please take a look again.
Thanks for the contribution! Unfortunately we can't verify the commit author(s): Sanmitra Ijeri s***@s***.com. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.
Related issues The model loading in a cluster was failing with error - op-model.json/part-00000 did not exist.
Additional context In #516 we made the changes to save model locally and convert to zip before moving to remote path. So the path passed to the below method
saveImpl
is of a local file system, hence when we are trying to save it as a HadoopFile, op-model.json/part-00000 is not being written out.