ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

Duplicate bin boundary #689

Open wuhaifengdhu opened 4 years ago

wuhaifengdhu commented 4 years ago

The boundary value 3.0 appears twice in binBoundary for this column:

"columnNum" : 6078,
"columnName" : "s_rcvd_rto_30_90",
"version" : "0.13.0",
"columnType" : "N",
"columnFlag" : "Candidate",
"finalSelect" : true,
"hybridThreshold" : 4.9E-324,
"columnStats" : {
  "max" : 3.0,
  "min" : 0.0,
  "mean" : 0.969593,
  "median" : 0.0,
  "totalCount" : 4427576,
  "distinctCount" : 231200,
  "missingCount" : 3682964,
  "validNumCount" : 744612,
  "stdDev" : 1.17932,
  "missingPercentage" : 0.831824004827924,
  "woe" : 5.486157256219863,
  "ks" : 1.930447,
  "iv" : 0.0104,
  "weightedKs" : 1.4138639062800982,
  "weightedIv" : 0.04148687102742978,
  "weightedWoe" : 4.560755748299773,
  "skewness" : 0.7936388647469407,
  "kurtosis" : 1.9988262682536753,
  "psi" : 0.0524373837831184,
  "unitStats" : null,
  "25th" : 0.0,
  "75th" : 0.0
},
"columnBinning" : {
  "length" : 17,
  "binBoundary" : [ "-Infinity", 0.0031792935934526665, 0.23121243910833428, 0.48411610233937674, 0.5878110649105356, 0.7433764029558342, 0.9075731041710438, 1.039727340908101, 1.321298913385911, 1.5104044807446948, 1.697140781975826, 2.0739407436641955, 2.464629881019581, 2.9924550691028093, 3.0, 3.0 ],
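
To make the report easier to reproduce, here is a minimal Python sketch that flags repeated values in a binBoundary list like the one above. It is not Shifu code: the helper names find_duplicate_boundaries and dedupe_boundaries are hypothetical, and dropping non-increasing boundaries is only one possible workaround, not necessarily the fix the project would choose.

```python
def find_duplicate_boundaries(boundaries):
    """Return (index, value) pairs where a boundary equals its predecessor."""
    # float() also parses the string "-Infinity" used for the first bin.
    values = [float(b) for b in boundaries]
    return [
        (i, cur)
        for i, (prev, cur) in enumerate(zip(values, values[1:]), start=1)
        if cur == prev
    ]


def dedupe_boundaries(boundaries):
    """Keep only strictly increasing boundaries (hypothetical workaround)."""
    result = []
    for value in (float(b) for b in boundaries):
        if not result or value > result[-1]:
            result.append(value)
    return result


# binBoundary values copied from the column config in this issue.
bin_boundary = [
    "-Infinity", 0.0031792935934526665, 0.23121243910833428,
    0.48411610233937674, 0.5878110649105356, 0.7433764029558342,
    0.9075731041710438, 1.039727340908101, 1.321298913385911,
    1.5104044807446948, 1.697140781975826, 2.0739407436641955,
    2.464629881019581, 2.9924550691028093, 3.0, 3.0,
]

print(find_duplicate_boundaries(bin_boundary))  # [(15, 3.0)]
print(len(dedupe_boundaries(bin_boundary)))     # 15
```

The duplicate sits at the column maximum (max is 3.0 in columnStats), so the last bin would have identical lower and upper bounds.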