SeldonIO / seldon-server

Machine Learning Platform and Recommendation Engine built on Kubernetes
https://www.seldon.io/
Apache License 2.0

No dimensions for int and boolean attribute type #45

Closed. Rocknpools closed this issue 4 years ago

Rocknpools commented 7 years ago

This is from the documentation at http://docs.seldon.io/api-oauth.html#actions

The item attribute definition is:

string name [attr_id 1]
string artist [attr_id 2]
enum category [attr_id 3]
double price [attr_id 4]

Where:

category is the enumeration (pop [value_id 1], rock [value_id 2], rap [value_id 3])
a range definition is created for the price (<10 [value_id 1], 10-20 [value_id 2], >20 [value_id 3])

We'll have the following dimension definition:

dimension1 [dim_id 1, attr_id 3, value_id 1] (category = pop)
dimension2 [dim_id 2, attr_id 3, value_id 2] (category = rock)
dimension3 [dim_id 3, attr_id 3, value_id 3] (category = rap)
dimension4 [dim_id 4, attr_id 4, value_id 1] (price < 10)
dimension5 [dim_id 5, attr_id 4, value_id 2] (price = 10-20)
dimension6 [dim_id 6, attr_id 4, value_id 3] (price > 20)

Now the question: when I define the attributes, Seldon automatically creates dimensions only for the enum values.

What about boolean attributes? What about int attributes? How should I define the ranges for an int attribute?

ukclivecox commented 7 years ago

Dimensions are only created for enum features. If you want one for price, you would need to create an enum that reflects the ranges, e.g. low, medium, and high price, or something more fine-grained.
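As a rough illustration of that workaround, here is a minimal Python sketch that maps a numeric price onto an enum-style value before the item is imported. The function name `price_band`, the attribute name `price_band`, the sample items, and the exact cut-offs are assumptions made up for this example (the cut-offs mirror the <10, 10-20, >20 ranges quoted above); nothing here is part of the Seldon API.

```python
def price_band(price: float) -> str:
    """Map a numeric price onto a hypothetical enum value name."""
    if price < 10:
        return "low"      # the "<10" range
    if price <= 20:
        return "medium"   # the "10-20" range, both ends included by assumption
    return "high"         # the ">20" range


# Hypothetical items; only the derived enum attribute would get dimensions.
items = [
    {"name": "song_a", "price": 7.5},
    {"name": "song_b", "price": 15.0},
    {"name": "song_c", "price": 42.0},
]

for item in items:
    item["price_band"] = price_band(item["price"])
```

The derived `price_band` value would then be declared as an enum attribute in place of (or alongside) the raw price, so that one dimension is created per band.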

Rocknpools commented 7 years ago

And the same goes for boolean attributes, I guess.

PS: What about demographics? Is this feature working right now?

ukclivecox commented 7 years ago

Yes. Creating dimensions for boolean attributes would be an obvious improvement, but it's not there at present.
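Until then, the same enum workaround should apply to booleans: the flag can be re-expressed as a two-valued enum on the client side. A minimal sketch, assuming a hypothetical boolean attribute `is_live` and made-up enum value names `yes` and `no` (neither comes from Seldon):

```python
def bool_to_enum(flag: bool) -> str:
    """Re-express a boolean as one of two hypothetical enum value names."""
    return "yes" if flag else "no"


# Hypothetical item carrying a boolean attribute.
item = {"name": "song_a", "is_live": True}

# Store the enum form under a separate, made-up attribute name.
item["is_live_enum"] = bool_to_enum(item["is_live"])
```

Declaring `is_live_enum` as an enum attribute with those two values would then yield two dimensions, one per value.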

Rocknpools commented 7 years ago

What about demographics? Is this feature working right now?

ukclivecox commented 7 years ago

User metadata is available, but most of the built-in recommendation algorithms do not take it into account, as they are purely collaborative-filtering based.