ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

Add hash seed feature for categorical column #630

Closed zhang7575 closed 5 years ago

zhang7575 commented 5 years ago
  1. Users can specify a hash seed for categorical features through the "categoricalHashSeedConfFile" setting in the "dataSet" section of ModelConfig.json. If the file is not specified, columns/categorical.hash.seed.conf is used by default for the hash settings.
  2. For columns that use a hash seed, the variable value is converted to originalValue.hashCode() mod HashSeed for normalization.
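
The conversion in item 2 can be sketched roughly as follows. This is only an illustration, not the code from this change: the class name `HashSeedSketch`, the method `bucket`, the column name `merchant_id`, and the in-memory map of seeds are all hypothetical, and the use of `Math.floorMod` (so that negative hash codes still give a non-negative bucket) is an assumption about how "mod" should behave.

```java
import java.util.HashMap;
import java.util.Map;

public class HashSeedSketch {

    // Hypothetical helper: maps a raw categorical value to a bucket in [0, hashSeed).
    static int bucket(String originalValue, int hashSeed) {
        if (hashSeed <= 0) {
            throw new IllegalArgumentException("hashSeed must be positive");
        }
        // Assumption: floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(originalValue.hashCode(), hashSeed);
    }

    public static void main(String[] args) {
        // Hypothetical per-column seeds; the real values would come from the hash seed conf file.
        Map<String, Integer> seeds = new HashMap<>();
        seeds.put("merchant_id", 10);

        int seed = seeds.get("merchant_id");
        // "abc".hashCode() is 96354, so with a seed of 10 this prints 4.
        System.out.println(bucket("abc", seed));
    }
}
```

With a seed of N, every value of the column falls into one of N buckets, so the seed effectively caps the number of categories the normalization step has to handle, at the cost of possible collisions between distinct values.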
zhang7575 commented 5 years ago

Fix the format issue and remove redundant loggers.