SeldonIO / seldon-server

Machine Learning Platform and Recommendation Engine built on Kubernetes
https://www.seldon.io/
Apache License 2.0

Array of attributes for attributes #44

Closed rudf0rd closed 7 years ago

rudf0rd commented 7 years ago

Reading through the docs, I don't see a way in the metadata structure to add an attribute with an array data type. Here's my issue:

I want to add ~10 actors per movie and ~15 actors per TV show. I originally thought about denormalizing this into actor_1, actor_2, etc. But that wouldn't relate the same actor across separate attributes; it would only match when an actor happens to be in actor_1 on multiple items, right?

In a perfect world, I'd be able to do something like:

{
  [...],
  "type_attrs": [
    {"name": "actors", "value_type": ["int"]}  //<-- array of ints
  ]
}

Is there anything I could do to get close to something like this? Is there something else I'm missing that would work instead?

The same thing happens with genres, since a movie or TV show usually has multiple genres (e.g. thriller/action vs. thriller/horror).

ukclivecox commented 7 years ago

Yes, at present the closest option would be to have separate metadata fields actor1..actorN. For genres, probably a boolean field for each.
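As a rough illustration of that workaround, here is a minimal Python sketch that flattens list-valued attributes into scalar fields before import. The function name `flatten_item`, the `genre_` prefix, and the choice to omit unused actor slots are assumptions made for this example; they are not part of Seldon's API or schema.

```python
from typing import Dict, Sequence, Union


def flatten_item(
    actor_ids: Sequence[int],
    genres: Sequence[str],
    max_actors: int,
    known_genres: Sequence[str],
) -> Dict[str, Union[int, bool]]:
    """Flatten list-valued attributes into scalar fields (hypothetical helper).

    Produces actor1..actorN int fields and one boolean field per known genre.
    """
    fields: Dict[str, Union[int, bool]] = {}

    # One int field per actor slot; actors beyond max_actors are dropped
    # and unused slots are simply omitted.
    for slot, actor_id in enumerate(actor_ids[:max_actors], start=1):
        fields[f"actor{slot}"] = actor_id

    # One boolean field per known genre (field naming is an assumption).
    present = set(genres)
    for genre in known_genres:
        fields[f"genre_{genre}"] = genre in present

    return fields


if __name__ == "__main__":
    item = flatten_item(
        actor_ids=[7, 42],
        genres=["thriller", "action"],
        max_actors=3,
        known_genres=["action", "horror", "thriller"],
    )
    print(item)
```

Each generated field would still need its own entry in the item type's attribute definitions, and the slot position of an actor carries no meaning, which is the limitation raised in the question.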

Most of the built-in recommendation algorithms are for collaborative filtering, where the activity is the core data used to create a model. For algorithms where you want to add extra context metadata alongside the activity data, I would look at Factorization Machines: https://www.slideshare.net/hongliangjie1/libfm

Unfortunately, we don't presently have an FM-based model.
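For readers unfamiliar with the idea, the following is a minimal sketch of how a second-order factorization machine scores one feature vector. It is not Seldon code. The feature layout is an assumption for illustration: `x` would concatenate a one-hot user, a one-hot item, and a multi-hot block of actors and genres, so list-valued attributes become several active positions in one vector. The parameters `w0`, `w`, and `v` are placeholders that would normally be learned.

```python
from typing import Sequence


def fm_predict(
    x: Sequence[float],
    w0: float,
    w: Sequence[float],
    v: Sequence[Sequence[float]],
) -> float:
    """Score x with a second-order factorization machine.

    y = w0 + sum_i w[i]*x[i] + sum_{i<j} <v[i], v[j]> * x[i]*x[j]

    The pairwise term uses the standard linear-time identity:
    0.5 * sum_f ((sum_i v[i][f]*x[i])^2 - sum_i (v[i][f]*x[i])^2)
    """
    n = len(x)
    k = len(v[0])  # number of latent factors per feature

    linear = w0 + sum(w[i] * x[i] for i in range(n))

    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)

    return linear + pairwise


if __name__ == "__main__":
    # Toy vector with three active features, e.g. one user, one item, one actor.
    x = [1.0, 0.0, 1.0, 1.0]
    w = [0.2, 0.3, -0.1, 0.5]
    v = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [0.5, 0.5]]
    print(fm_predict(x, 0.1, w, v))
```

Because every pair of active features interacts through its latent vectors, a multi-hot actor block lets the same actor contribute across items regardless of position, which is what the actor1..actorN layout cannot express.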