MleapSpringBoot - Multi-input StringIndexer not supported yet

combust / mleap

To whom it may concern,

I'm trying to deploy a PySpark pipeline as an MLeap bundle using the combustml/mleap-spring-boot:0.19.0-SNAPSHOT Docker image, and I get this error:

[MleapSpringBoot-akka.actor.default-dispatcher-6] [akka://MleapSpringBoot/user/transform/model] 
Cannot load bundle because: java.lang.UnsupportedOperationException: Multi-input StringIndexer not supported yet.

Any insight into how I can fix it?

The bundle has the following structure:

model
├── bundle.json
└── root
    ├── RandomForestClassifier_e24b4862ceb2.node
    │   ├── model.json
    │   ├── node.json
    │   ├── tree0
    │   .......
    │   └── tree9
    │       ├── model.json
    │       └── tree.json
    ├── StandardScaler_a24a7bb9bb7b.node
    │   ├── model.json
    │   └── node.json
    ├── StringIndexer_07ad6a29446e.node
    │   ├── model.json
    │   └── node.json
    ├── StringIndexer_397d06fcffaa.node
    │   ├── model.json
    │   └── node.json
    ├── VectorAssembler_56af20ae6ed6.node
    │   ├── model.json
    │   └── node.json
    ├── VectorAssembler_c118350511db.node
    │   ├── model.json
    │   └── node.json
    ├── model.json
    └── node.json

The pipeline was trained using ml.combust.mleap:mleap-runtime_2.12:0.18.1 and ml.combust.mleap:mleap-spark_2.12:0.18.1 with Spark 3.1.2.

Thanks

indexer1 = StringIndexer(inputCol="foo", outputCol="a")
indexer2 = StringIndexer(inputCol="bar", outputCol="b")
indexer3 = StringIndexer(inputCol="baz", outputCol="c")
pipe = Pipeline(stages=[..., indexer1, indexer2, indexer3, ...])
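
If the original pipeline used a single StringIndexer with inputCols/outputCols (an assumption, since the training code isn't shown in this thread), the one-indexer-per-column pattern above could be generated with a small helper rather than written out by hand. This is only a sketch: the helper names single_column_indexer_kwargs and build_single_column_indexers are hypothetical, and the column names are the placeholder ones from the snippet above.

```python
from typing import Dict, List


def single_column_indexer_kwargs(
    input_cols: List[str], output_cols: List[str]
) -> List[Dict[str, str]]:
    """Return one {"inputCol", "outputCol"} kwargs dict per column pair.

    Hypothetical helper: it only pairs up column names, so it has no
    Spark dependency and can be checked on its own.
    """
    if len(input_cols) != len(output_cols):
        raise ValueError("input_cols and output_cols must have the same length")
    return [
        {"inputCol": in_col, "outputCol": out_col}
        for in_col, out_col in zip(input_cols, output_cols)
    ]


def build_single_column_indexers(input_cols: List[str], output_cols: List[str]):
    """Build one single-input StringIndexer per column pair (needs pyspark)."""
    # Imported lazily so the pairing helper above works without Spark installed.
    from pyspark.ml.feature import StringIndexer

    return [
        StringIndexer(**kwargs)
        for kwargs in single_column_indexer_kwargs(input_cols, output_cols)
    ]
```

Usage would look something like Pipeline(stages=[*build_single_column_indexers(["foo", "bar", "baz"], ["a", "b", "c"]), ...]), with the remaining stages unchanged. The pipeline would need to be refit and the bundle re-exported before loading it in the Spring Boot image again.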

MleapSpringBoot - Multi-input StringIndexer not supported yet #784