combust / mleap

MLeap: Deploy ML Pipelines to Production
https://combust.github.io/mleap-docs/
Apache License 2.0

Loading Mleap models from external file systems #595

Open TotalForgot opened 4 years ago

TotalForgot commented 4 years ago

Dear experts,

We noticed that MLeap-serving can load a model only if the model is accessible on the local file system of the MLeap-serving server. This strongly limits the usefulness of the MLeap server, especially in distributed environments. Is this still true for the newest version of MLeap? If not, how can we use standard distributed file systems, such as HDFS, to store and load MLeap models?

For instance, we would like to be able to send a load request like this one: '{"modelName":"mymodel1","uri":"hdfs://hdfsnamespace/models/model1.zip","config":{"memoryTimeout":900000,"diskTimeout":900000},"force":false}'
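To make the request above concrete, here is a minimal Python sketch that builds the same payload and, optionally, posts it. The field names come from the example itself. The helper names and the endpoint `http://localhost:8080/models` are assumptions for illustration and should be replaced with whatever your MLeap-serving deployment actually exposes.

```python
import json
import urllib.request
from urllib.parse import urlparse


def build_load_request(model_name, uri, memory_timeout_ms=900_000,
                       disk_timeout_ms=900_000, force=False):
    """Build a load-model payload with the field names used in the example above."""
    # Require an explicit scheme (file, hdfs, ...) so the server is never left guessing.
    if not urlparse(uri).scheme:
        raise ValueError(f"model URI needs an explicit scheme: {uri!r}")
    return {
        "modelName": model_name,
        "uri": uri,
        "config": {"memoryTimeout": memory_timeout_ms,
                   "diskTimeout": disk_timeout_ms},
        "force": force,
    }


def post_load_request(payload, endpoint="http://localhost:8080/models"):
    """POST the payload as JSON. ASSUMPTION: the endpoint is a placeholder URL."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    request = urllib.request.Request(
        endpoint, data=body, method="POST",
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return response.status


# Only build and print the payload here; nothing is sent over the network.
payload = build_load_request("mymodel1", "hdfs://hdfsnamespace/models/model1.zip")
print(json.dumps(payload, separators=(",", ":")))
```

Whether the server accepts the `hdfs://` URI depends on HDFS support being available on the serving side, which is the question asked here.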

Best,
Tien Dat

FredYao commented 4 years ago

@TotalForgot, MLeap now supports loading from HDFS. See this PR (https://github.com/combust/mleap/pull/415) for more details. Also, here is a simple example showing how to do it: https://github.com/combust/mleap/issues/563#issuecomment-573960259.

ancasarb commented 4 years ago

To get this to work, you might need to add bundle-hdfs (https://github.com/combust/mleap/tree/master/bundle-hdfs) as a dependency of the serving project (Spring Boot, for example) and add a repository implementation similar to the one for S3 (https://github.com/combust/mleap/blob/master/mleap-repository-s3/src/main/scala/ml/combust/mleap/repository/s3/S3Repository.scala). Hope this helps!

Let me know if you'd be interested in contributing this. Thank you!
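For readers unfamiliar with the repository idea mentioned above, the general shape can be sketched as follows: each repository claims URIs by scheme and copies the model bundle to a local temporary file, which the server then loads as usual. This is a language-neutral illustration written in Python, not MLeap's actual Scala API. All class and function names are hypothetical, and the HDFS reader is deliberately left as a stub to be filled in with a real HDFS client.

```python
import shutil
import tempfile
from pathlib import Path
from urllib.parse import unquote, urlparse


class SchemeRepository:
    """Hypothetical repository: claims URIs by scheme, copies bundles to local temp files."""

    def __init__(self, scheme, open_stream):
        self.scheme = scheme
        # open_stream(uri) must return a readable binary file object.
        self.open_stream = open_stream

    def can_handle(self, uri):
        return urlparse(uri).scheme == self.scheme

    def download_bundle(self, uri):
        if not self.can_handle(uri):
            raise ValueError(f"{self.scheme} repository cannot handle {uri!r}")
        # Open the source first so a failed open leaves no stray temp file behind.
        with self.open_stream(uri) as source:
            with tempfile.NamedTemporaryFile(
                    prefix="mleap", suffix=".bundle.zip", delete=False) as target:
                shutil.copyfileobj(source, target)
        return Path(target.name)


def open_local(uri):
    # Reads file:// URIs from the local file system (POSIX-style paths assumed).
    return open(unquote(urlparse(uri).path), "rb")


def open_hdfs(uri):
    # ASSUMPTION: placeholder only; a real version would open the path with an HDFS client.
    raise NotImplementedError("plug in an HDFS client here")


repositories = [
    SchemeRepository("file", open_local),
    SchemeRepository("hdfs", open_hdfs),
]


def download(uri):
    """Hand the URI to the first repository that claims its scheme."""
    for repository in repositories:
        if repository.can_handle(uri):
            return repository.download_bundle(uri)
    raise ValueError(f"no repository can handle {uri!r}")
```

With this structure, supporting a new storage system means registering one more repository, while the code that loads the downloaded local file stays unchanged.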