autodeployai / pmml4s

PMML scoring library for Scala
https://www.pmml4s.org/
Apache License 2.0

Scoring results "Like there is no outputs" #6

Closed asmirnov-tba closed 4 years ago

asmirnov-tba commented 4 years ago

Hi, Autodeployai team!

I understand the idea: if the PMML file defines output fields, the prediction results contain only those fields; if not, everything available is output. But is there any way to load the PMML file while ignoring the "Output" fields? In my project, I need to output all possible scoring results regardless of whether the PMML file has an "Output" section.

I'm looking for something like: Model fromInputStream(InputStream is, Boolean ignoreOutput)

Manually editing the PMML file is not an option: it breaks all the automation and completely ruins the user experience.

If there is no such possibility right now, in my opinion, this will be a HUGE benefit for all your users.

And, by the way, thank you very much. You've made an awesome product, and I'm really enjoying using it!

scorebot commented 4 years ago

@asmirnov-tba This issue is not new; there have already been many discussions about it. Supporting this function perfectly is not a simple task, because the Output element allows for post-processing of output fields. When ignoreOutput is true, we cannot simply return all possible results in place of the defined output fields: if the output fields contain post-transformations, that extra processing would be skipped and the result could be wrong.

A feasible way is to add the missing results to the existing Output, although the dependencies among all output fields need to be considered carefully.

Please let me know if you have any comments. We need more discussion before implementing this function.
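To make the concern concrete, here is a minimal toy sketch in plain Java. It does not use pmml4s at all; the class, the methods, and the field names predicted_y and final_y are hypothetical, and the exp() post-transformation is just an assumed example of what an Output element might define.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OutputPostProcessingSketch {

    // Candidate results a model could produce when no Output element is defined.
    // The linear formula is an arbitrary stand-in for a real model.
    static Map<String, Double> candidateResults(double x) {
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("predicted_y", 2.0 * x + 1.0);
        return raw;
    }

    // Results when the Output element defines one post-processed field,
    // assumed here to be final_y = exp(predicted_y).
    static Map<String, Double> definedOutput(double x) {
        double raw = candidateResults(x).get("predicted_y");
        Map<String, Double> out = new LinkedHashMap<>();
        out.put("final_y", Math.exp(raw));
        return out;
    }

    // Supplementing: keep every defined field and add only the candidates
    // that are not already present, so post-processing is never skipped.
    static Map<String, Double> supplemented(double x) {
        Map<String, Double> out = new LinkedHashMap<>(definedOutput(x));
        candidateResults(x).forEach(out::putIfAbsent);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(candidateResults(1.0)); // raw value only, no final_y
        System.out.println(definedOutput(1.0));    // post-processed value only
        System.out.println(supplemented(1.0));     // both fields
    }
}
```

Returning only candidateResults would silently drop final_y, which is the failure mode described above. The supplemented variant keeps the defined field and adds the raw one next to it.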

asmirnov-tba commented 4 years ago

@scorebot Actually, the approach you suggest may be even better, because my general goal is not to ignore the output fields but to also get access to the post-processed fields. Basically, I want access to the scoring results for the fields listed in model.targetFields().

scorebot commented 4 years ago

@asmirnov-tba Thanks for your proposal and comments. We will add this function as an enhancement to the next development plan.

asmirnov-tba commented 4 years ago

Cool, thank you. Having something like "model.predictTargets()" would be super useful. Do you have a public Jira or something similar? Or some kind of public roadmap? It would be very interesting to know what is coming in the next versions.

scorebot commented 4 years ago

Based on the latest PMML 4.4, there are no special new features that need to be implemented. Currently, there are only maintenance tasks:

  1. Make predictions correctly according to the PMML standard; the main task is fixing bugs reported by users.
  2. Performance improvements, especially for ensemble models exported by XGBoost or LightGBM.

The next major version will mainly target the next release of PMML (maybe 5.0), which could support deep neural network models; a draft version has been made by Nyoka.

scorebot commented 4 years ago

@asmirnov-tba The requested changes are done. There are two ways to change the default outputs:

  1. Set the flag supplementOutput when the Output element is defined in the PMML. When it is true, candidate outputs that are not in the defined list are produced in addition to the defined outputs. The default value of this flag is false.

  2. Set the output fields explicitly with the method setOutputFields, regardless of whether the PMML has an Output element.
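As a rough illustration of how these two options could interact, here is a toy sketch in plain Java. It is not the pmml4s implementation and does not call the library. The resolve helper, its parameters, and the field names are all hypothetical, and the assumption that explicitly set fields take precedence over the flag is mine, not something stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class OutputSelectionSketch {

    // Decide which output field names are produced.
    //   defined          - fields from the PMML Output element (empty if absent)
    //   candidates       - everything the model could output by default
    //   supplementOutput - option 1: also add candidates missing from 'defined'
    //   customFields     - option 2: explicit list, or null if not set
    static List<String> resolve(List<String> defined, List<String> candidates,
                                boolean supplementOutput, List<String> customFields) {
        // Assumption: an explicit list replaces everything else.
        if (customFields != null) {
            return new ArrayList<>(customFields);
        }
        // No Output element: output all candidates, as described at the top of the thread.
        if (defined.isEmpty()) {
            return new ArrayList<>(candidates);
        }
        List<String> result = new ArrayList<>(defined);
        if (supplementOutput) {
            for (String candidate : candidates) {
                if (!result.contains(candidate)) {
                    result.add(candidate);
                }
            }
        }
        return result;
    }
}
```

With the flag left at false, only the defined fields come back, which matches the default behavior described in option 1.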

Please refer to the unit test "custom model outputs" for details. Let me know if you have any comments.

BTW, you need to build the latest code yourself; it's very easy with sbt.

scorebot commented 4 years ago

@asmirnov-tba Did the latest code resolve your issues?

scorebot commented 4 years ago

@asmirnov-tba I'm closing this issue now. If you have other problems, please feel free to open a new one.