h2oai / steam

DEPRECATED Build, manage and deploy H2O's high-speed machine learning models.
http://www.h2o.ai/download/
GNU Affero General Public License v3.0

No way of changing convertUnknownCategoricalLevelsToNa using prediction-service-builder #377

Open ksbg opened 7 years ago

ksbg commented 7 years ago

When loading a POJO model manually in Java, I can set convertUnknownCategoricalLevelsToNa to true at load time:

EasyPredictModelWrapper model = new EasyPredictModelWrapper(
    new EasyPredictModelWrapper.Config()
        .setModel(rawModel)
        .setConvertUnknownCategoricalLevelsToNa(true));

However, as I understand it, this is not possible with Steam's prediction service builder: no option or parameter lets you set it while building the .war file. As a result, a service built this way cannot make predictions on rows that contain unknown categorical levels.

mstensmo commented 7 years ago

Yes, that is correct. However, the project is open source, so you can change the code to do what you want. See here: https://github.com/h2oai/steam/blob/master/prediction-service-builder/src/main/webapp/extra/src/ServletUtil-TEMPLATE.java#L73 and https://github.com/h2oai/steam/blob/master/prediction-service-builder/src/main/webapp/extra/src/ServletUtil-TEMPLATE.java#L91

Once you have made the change, restart the Prediction Service Builder and create your new services; they will then behave the way you want.

ksbg commented 7 years ago

Yes, I've already done that. However, this seems like important functionality, so I think it would make sense to add it as an option (e.g. an extra parameter when building the service from the CLI, or a checkbox in the web UI). If people agree, I'm willing to add it.
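
For illustration, the opt-in described above could be sketched roughly as follows. This is only a sketch under stated assumptions: the class name UnknownLevelOption and the property name steam.convertUnknownCategoricalLevelsToNa are hypothetical and do not exist in Steam or the prediction service builder. The point is that the flag defaults to false, so existing services keep the strict behavior unless someone turns it on.

```java
import java.util.Properties;

// Hypothetical sketch: none of these names exist in Steam or the prediction service builder.
final class UnknownLevelOption {
    // Hypothetical property name, chosen for illustration only.
    static final String PROPERTY = "steam.convertUnknownCategoricalLevelsToNa";

    // Returns true only when the property is explicitly set to "true"
    // (case-insensitive), so the existing strict behavior stays the default.
    static boolean fromProperties(Properties props) {
        return Boolean.parseBoolean(props.getProperty(PROPERTY, "false"));
    }

    public static void main(String[] args) {
        boolean convert = fromProperties(System.getProperties());
        // In the generated servlet, this value would be handed to
        // EasyPredictModelWrapper.Config#setConvertUnknownCategoricalLevelsToNa(convert).
        System.out.println("convertUnknownCategoricalLevelsToNa=" + convert);
    }
}
```

With this shape, running with -Dsteam.convertUnknownCategoricalLevelsToNa=true would enable the conversion; a CLI parameter or web UI checkbox would only need to set the same boolean.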

mstensmo commented 7 years ago

The problem is that some people want the default behavior, where an unknown categorical value raises an error, because that tells them the wrong data was sent in. Others, like you, want the value treated as NA. So supporting both means passing an additional parameter, as you say, which makes things more complicated.
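
The two behaviors being weighed here can be illustrated with a small stand-alone sketch. It is not H2O's implementation: CategoricalEncoder and its domain list are hypothetical names made up for this example, and the real logic lives inside EasyPredictModelWrapper. It only shows how a single boolean switches between failing on an unknown level and mapping it to NA (represented here as Double.NaN).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two policies; not part of h2o-genmodel.
final class CategoricalEncoder {
    private final List<String> domain;          // levels seen during training
    private final boolean convertUnknownToNa;   // false = strict (error), true = lenient (NA)

    CategoricalEncoder(List<String> domain, boolean convertUnknownToNa) {
        this.domain = domain;
        this.convertUnknownToNa = convertUnknownToNa;
    }

    // Maps a level to its index in the training domain.
    // An unknown level becomes NaN in lenient mode and an exception in strict mode.
    double encode(String level) {
        int index = domain.indexOf(level);
        if (index >= 0) {
            return index;
        }
        if (convertUnknownToNa) {
            return Double.NaN;
        }
        throw new IllegalArgumentException("Unknown categorical level: " + level);
    }

    public static void main(String[] args) {
        List<String> colors = Arrays.asList("red", "green", "blue");
        CategoricalEncoder lenient = new CategoricalEncoder(colors, true);
        System.out.println(lenient.encode("green"));
        System.out.println(Double.isNaN(lenient.encode("purple")));
    }
}
```

Strict mode surfaces bad input immediately, while lenient mode lets the model score the row with a missing value; which is preferable depends on whether unseen levels are expected in production data.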