ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

NaN falls into the last bin #717

Closed m4rkl1u closed 4 years ago

m4rkl1u commented 4 years ago

The bin index is looked up at this line: https://github.com/ShifuML/shifu/blob/master/src/main/java/ml/shifu/shifu/core/Normalizer.java#L640

If the value is "NaN", the lookup returns the last bin number: https://github.com/ShifuML/shifu/blob/master/src/main/java/ml/shifu/shifu/util/BinUtils.java#L189

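  // Reproduces the issue: look up the bin index of a NaN value.
  public void test2() {
    // Lower bin boundaries; the first bin starts at negative infinity.
    List<Double> bins = Arrays.asList(Double.NEGATIVE_INFINITY,
        1.3540577379901267E-4, 3.4140840073655133E-4, 8.388314917856846E-4,
        0.0015892474745600124, 0.003504418399376486, 0.013004966254920104);

    Double val = Double.NaN;

    // getBinIndex is the BinUtils method linked above.
    System.out.println(getBinIndex(bins, val));
  }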

This is supposed to be the last bin number.
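One way a NaN can end up in the last bin is sketched below. This is not the Shifu implementation: `naiveBinIndex` and `guardedBinIndex` are hypothetical helpers written for illustration. The sketch assumes the lookup orders values with `Double.compare`, which treats NaN as greater than every other double, so a search for the last boundary not exceeding the value walks all the way to the final bin. An explicit `Double.isNaN` guard is one possible way to keep such values out of the regular bins; the `-1` return value is an assumption, not what the develop-branch fix necessarily does.

```java
import java.util.Arrays;
import java.util.List;

public class NaNBinSketch {

    // Hypothetical lookup (not BinUtils.getBinIndex): binary search for the
    // index of the last lower boundary that is <= value.
    static int naiveBinIndex(List<Double> bins, double value) {
        int low = 0;
        int high = bins.size() - 1;
        int result = -1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            // Double.compare orders NaN above every other value, so for a
            // NaN input this branch is taken at every step.
            if (Double.compare(bins.get(mid), value) <= 0) {
                result = mid;
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return result;
    }

    // Same lookup with an explicit NaN guard; -1 is an assumed
    // "no regular bin" marker chosen for this sketch.
    static int guardedBinIndex(List<Double> bins, double value) {
        if (Double.isNaN(value)) {
            return -1;
        }
        return naiveBinIndex(bins, value);
    }

    public static void main(String[] args) {
        List<Double> bins = Arrays.asList(Double.NEGATIVE_INFINITY,
                1.3540577379901267E-4, 3.4140840073655133E-4, 8.388314917856846E-4,
                0.0015892474745600124, 0.003504418399376486, 0.013004966254920104);

        System.out.println(naiveBinIndex(bins, Double.NaN));   // prints 6 (last bin)
        System.out.println(guardedBinIndex(bins, Double.NaN)); // prints -1
        System.out.println(guardedBinIndex(bins, 5.0E-4));     // prints 2
    }
}
```

With the seven boundaries from the test above, the unguarded lookup returns index 6 for NaN, while ordinary values are unaffected by the guard.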

zhangpengshan commented 4 years ago

Good catch, Mark. WIP.

zhangpengshan commented 4 years ago

Done in develop branch.