Qihoo360 / tensornet

Apache License 2.0
315 stars 72 forks

What is the <bucket_size> param used for in feature_column.py? #46

Open pengyuan opened 3 years ago

pengyuan commented 3 years ago

https://github.com/Qihoo360/tensornet/blob/master/tensornet/feature_column/category_column.py What is the <bucket_size> param used for in feature_column.py?

I cannot find any hash function or other usage of it...

  1. Does category_column only accept slot/int/ID features?
  2. Or does it support string categorical features, hashed according to bucket_size?

zhangys-lucky commented 3 years ago

Please convert all features to uint64 first. You can use a hash function such as MurmurHash to do that.
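
A minimal sketch of that preprocessing step, for illustration only. It is not tensornet code: the helper name `feature_to_uint64` and the `slot=value` key layout are hypothetical, and it uses BLAKE2b from Python's standard library as a stand-in for MurmurHash so that it runs without extra packages. Any stable 64-bit hash should serve the same purpose, provided the same function is used for both training and serving data.

```python
import hashlib


def feature_to_uint64(slot: str, value: str) -> int:
    """Map a (slot, raw value) pair to an integer in [0, 2**64).

    Hypothetical helper: BLAKE2b with an 8-byte digest stands in for
    MurmurHash here. Prefixing the slot name keeps identical raw values
    in different slots from sharing an ID.
    """
    key = f"{slot}={value}".encode("utf-8")
    digest = hashlib.blake2b(key, digest_size=8).digest()
    # Interpret the 8 digest bytes as an unsigned 64-bit integer.
    return int.from_bytes(digest, byteorder="little", signed=False)


# Example: string categorical values become uint64 IDs before training.
ids = [feature_to_uint64("city", v) for v in ["beijing", "shanghai", "beijing"]]
```

The mapping is deterministic, so the same string always yields the same ID, and the resulting integers can then be passed to the category column as ordinary ID features.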