ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

Enhance Norm Compact #736

Closed Liu-Delin closed 3 years ago

Liu-Delin commented 3 years ago

Description

This change only affects shifu norm -Dshifu.norm.only.selected=true. It does not affect shifu train or the default shifu norm.

1. Fix an empty column in the output.

When we normalize only the selected columns, the output has 3 parts: target, selected columns, and weight. However, the data rows contain an empty column between the selected columns and the weight. For example, with 3 selected columns, the header is:

target|column1|column2|column3|weight

But the data becomes:

1|0.5|-0.4|0.1||1.0

There is an extra | between column3 and weight, so each data row has one more field than the header. I fixed it in lines 415-422 and line 444.
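
To illustrate how this kind of extra delimiter can arise, here is a minimal sketch. It is not the Shifu code: the class NormRowSketch and the methods joinRowBuggy and joinRow are hypothetical names, and the buggy pattern shown (a trailing delimiter after every selected value plus a leading delimiter before the weight) is only an assumed cause that reproduces the same symptom.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical illustration only; names and logic are assumptions, not Shifu code.
public class NormRowSketch {

    // Assumed buggy pattern: every selected value is followed by a delimiter,
    // and the weight is then prefixed with another delimiter.
    static String joinRowBuggy(String target, List<String> selected, String weight) {
        StringBuilder sb = new StringBuilder(target).append('|');
        for (String value : selected) {
            sb.append(value).append('|');
        }
        sb.append('|').append(weight); // second delimiter creates the empty column
        return sb.toString();
    }

    // One possible fix: let StringJoiner place exactly one delimiter between fields.
    static String joinRow(String target, List<String> selected, String weight) {
        StringJoiner joiner = new StringJoiner("|");
        joiner.add(target);
        for (String value : selected) {
            joiner.add(value);
        }
        joiner.add(weight);
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<String> selected = Arrays.asList("0.5", "-0.4", "0.1");
        System.out.println(joinRowBuggy("1", selected, "1.0")); // 1|0.5|-0.4|0.1||1.0
        System.out.println(joinRow("1", selected, "1.0"));      // 1|0.5|-0.4|0.1|1.0
    }
}
```

With the fixed version, a data row splits into 5 fields, which matches the 5 names in the header target|column1|column2|column3|weight.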