salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License
Save model locally and convert to zip before moving to final path and do the reverse for loading #516

Status: Closed (leahmcguire closed this 3 years ago)

leahmcguire commented 4 years ago

Related issues: The MLeap bundle save works only with the local file system, so workflow saves made directly to HDFS fail after merging #475. See https://github.com/salesforce/TransmogrifAI/issues/514

Describe the proposed solution: Save the model to a local temporary location first, then zip it and move the zip file to the final location. For loading, do the reverse: copy the zipped file to the local file system and unzip it before reading.
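
The save and load flow described above can be sketched as follows. This is a minimal illustration in Python, not the code in this PR (TransmogrifAI itself is written in Scala). The names `save_model`, `load_model`, `write_model`, and `read_model` are hypothetical, and the local file system stands in for the final location; in a real deployment the move and copy steps would go through whatever client the remote file system (for example HDFS) requires.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def save_model(write_model, final_zip_path):
    """Write the model locally, zip it, then move the zip to its final path.

    write_model is a hypothetical callback that writes model files into the
    local directory it is given (standing in for a local-only bundle writer).
    """
    final_zip_path = Path(final_zip_path)
    with tempfile.TemporaryDirectory() as tmp:
        local_dir = Path(tmp) / "model"
        local_dir.mkdir()
        write_model(local_dir)
        # make_archive appends ".zip" to the base name and returns the archive path.
        local_zip = shutil.make_archive(
            str(Path(tmp) / "model"), "zip", root_dir=str(local_dir)
        )
        final_zip_path.parent.mkdir(parents=True, exist_ok=True)
        # Assumption: a plain move stands in for the upload to the final location.
        shutil.move(local_zip, str(final_zip_path))
    return final_zip_path


def load_model(read_model, final_zip_path):
    """Copy the zip to a local temp dir, unzip it, and read the model from there.

    read_model is a hypothetical callback that loads a model from a local directory.
    """
    with tempfile.TemporaryDirectory() as tmp:
        local_zip = Path(tmp) / "model.zip"
        shutil.copy(str(final_zip_path), str(local_zip))
        local_dir = Path(tmp) / "model"
        with zipfile.ZipFile(local_zip) as zf:
            zf.extractall(local_dir)
        # Read before the temporary directory is cleaned up.
        return read_model(local_dir)
```

Because the writer only ever sees a local directory, a serializer that supports only the local file system can be used unchanged, and the single zip file is the only artifact that has to cross to the final location.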

codecov[bot] commented 4 years ago

Codecov Report

Merging #516 into master will decrease coverage by 0.02%. The diff coverage is 72.72%.

@@            Coverage Diff             @@
##           master     #516      +/-   ##
==========================================
- Coverage   86.72%   86.70%   -0.03%     
==========================================
  Files         347      347              
  Lines       11897    11883      -14     
  Branches      381      379       -2     
==========================================
- Hits        10318    10303      -15     
- Misses       1579     1580       +1     
Impacted Files Coverage Δ
...c/main/scala/com/salesforce/op/ModelInsights.scala 90.12% <0.00%> (-1.76%) :arrow_down:
...op/evaluators/OpMultiClassificationEvaluator.scala 94.84% <ø> (ø)
.../src/main/scala/com/salesforce/op/OpWorkflow.scala 88.19% <100.00%> (ø)
...main/scala/com/salesforce/op/OpWorkflowModel.scala 93.90% <100.00%> (ø)
...cala/com/salesforce/op/OpWorkflowModelReader.scala 92.52% <100.00%> (+1.31%) :arrow_up:
...cala/com/salesforce/op/OpWorkflowModelWriter.scala 100.00% <100.00%> (ø)
...m/salesforce/op/utils/stages/NameDetectUtils.scala 88.05% <0.00%> (-1.40%) :arrow_down:
...e/op/stages/impl/feature/SmartTextVectorizer.scala 95.20% <0.00%> (-0.13%) :arrow_down:
...s/impl/feature/OPCollectionHashingVectorizer.scala 96.50% <0.00%> (-0.05%) :arrow_down:
...p/stages/impl/feature/SmartTextMapVectorizer.scala 100.00% <0.00%> (ø)
... and 54 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 09200c8...00f3e8d.

leahmcguire commented 3 years ago

https://github.com/combust/mleap/issues/721

leahmcguire commented 3 years ago

You want the link in the code?

tovbinm commented 3 years ago

@leahmcguire It's fine to have it here.

salesforce-cla[bot] commented 3 years ago

Thanks for the contribution! It looks like @leahmcguire is an internal user so signing the CLA is not required. However, we need to confirm this.

salesforce-cla[bot] commented 3 years ago

Thanks for the contribution! Unfortunately we can't verify the commit author(s): leahmcguire l***@s***.com Leah McGuire l***@s***.com. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.