tensorflow / recommenders

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.

loaded scann index can not use k param #298

Open · liyingkun1237 opened this issue 3 years ago

liyingkun1237 commented 3 years ago

https://github.com/tensorflow/recommenders/blob/main/docs/examples/basic_retrieval.ipynb

# Export the query model.
with tempfile.TemporaryDirectory() as tmp:
  path = os.path.join(tmp, "model")

  # Save the index.
  index.save(path)

  # Load it back; can also be done in TensorFlow Serving.
  loaded = tf.keras.models.load_model(path)

  # Pass a user id in, get top predicted movie titles back.
  scores, titles = loaded(["42"])

  print(f"Recommendations: {titles[0][:3]}")

The loaded object doesn't take a k parameter, so how can I return more than the default 10 movies?

maciejkula commented 3 years ago

The SavedModel format saves concrete TensorFlow functions - with a fixed k. If you'd like to serve multiple values of k, you should create multiple concrete signatures and save those.

For example:

recommendations_at_10 = tf.function(lambda user_id: model(user_id, k=10))
recommendations_at_100 = tf.function(lambda user_id: model(user_id, k=100))

# Call them once to create concrete functions.
some_input = tf.constant(["42"])  # any representative query input
_ = recommendations_at_10(some_input)
_ = recommendations_at_100(some_input)

tf.saved_model.save(
  model,
  path,  # export directory
  signatures={
    "k_10": recommendations_at_10.get_concrete_function(),
    "k_100": recommendations_at_100.get_concrete_function(),
  }
)
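
At serving time you then call the saved signatures by name on the loaded object (a later comment in this thread shows the same pattern); note that a signature function returns a dict of named output tensors rather than the usual (scores, titles) tuple. A minimal sketch, assuming the save above:

loaded = tf.saved_model.load(path)

# Each signature has its value of k baked in; pick the one you need.
# The call returns a dict of output tensors, not a tuple.
top_10 = loaded.signatures["k_10"](tf.constant(["42"]))
top_100 = loaded.signatures["k_100"](tf.constant(["42"]))
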
howardwang15 commented 3 years ago

How would you recommend we handle cases where, at serving time, we want to dynamically return a certain number of results? Saving a signature for each possible value of k seems inefficient. Another solution I can think of is adding an extra input to the model that controls how many of the returned results are kept, but that would require setting an upper limit on how many results are returned in the first place (e.g. something like all_results[:k]).
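
For what it's worth, a minimal sketch of that second idea, using the index object from the original example (MAX_K and requested_k are illustrative names, not library parameters): trace the model with a generous fixed upper limit and slice the result down to the size requested at serving time.

MAX_K = 100  # illustrative upper bound baked into the traced function

recommend_max = tf.function(lambda queries: index(queries, k=MAX_K))
_ = recommend_max(tf.constant(["42"]))  # trace once before saving

# At serving time, keep only as many of the MAX_K results as are actually wanted.
requested_k = 25
scores, titles = recommend_max(tf.constant(["42"]))
scores, titles = scores[:, :requested_k], titles[:, :requested_k]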

alimirferdos commented 3 years ago

@maciejkula I have two issues concerning the answer you gave above.

First, even though I call the functions before creating the concrete functions, I still have to pass example inputs. I got some results using the following snippet, but I don't think it's the right way to do it:

recom_10 = tf.function(lambda user_id: model(user_id, k=10))
_ = recom_10(tf.constant([812]))

signatures = {"k_10": recom_10.get_concrete_function(user_id=tf.constant([812]))}

My second issue was creating a concrete function for query_with_exclusions. I've used the following code:

excluded_recoms = tf.function(lambda user_id, exclusions:
                              index.query_with_exclusions(user_id, k=10, exclusions=exclusions))
_ = excluded_recoms(user_id=tf.constant([812]), exclusions=tf.constant([[150]]))

signatures = {"excluded_recoms": excluded_recoms.get_concrete_function(user_id=tf.constant([812]), exclusions=tf.constant([[150]]))}

This way I can save and load the index but I can only pass a single exclusion id because its shape is set to (1, 1).

Any idea on how to solve these two issues?

maciejkula commented 3 years ago

Your concrete function spec implies that you want to exclude only one thing.

You can either create multiple functions for multiple shapes, or settle on a larger shape and pad smaller exclusions.

Edit: you could also consider passing something with a dynamic shape to exclusions when creating the concrete function, along the lines of concrete = excluded_recoms.get_concrete_function(user_id=..., exclusions=tf.TensorSpec([1, None], dtype=tf.int32)). I haven't tested this but it may work.

sam-watts commented 2 years ago

@maciejkula is there a reason (other than it feeling like a bit of a hack...) why you couldn't just define the argument k in topK.query_with_exclusions as a tf.Tensor and then extract the value from that? Would that allow you to define it via a TensorSpec and get a concrete function that accepts multiple values of k at inference time?
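
For concreteness, an untested sketch of that idea (it assumes the underlying top-k op tolerates a scalar tensor for k, which tf.math.top_k does, but the ScaNN layer may not):

recommend = tf.function(lambda queries, k: index(queries, k=k))

# Trace with a scalar TensorSpec for k so one concrete function can serve
# any value of k supplied at inference time.
concrete = recommend.get_concrete_function(
    queries=tf.TensorSpec(shape=[None], dtype=tf.string),
    k=tf.TensorSpec(shape=[], dtype=tf.int32),
)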

sam-watts commented 2 years ago

And just to confirm for future readers and maybe @alimirferdos - tf.TensorSpec works for the exclusions argument. I used something like this for model saving. In this case, you would need to pad all exclusions arrays to 10 in the 2nd dimension with out of vocabulary tokens.

index.save(
    model_dir,
    signatures=index.query_with_exclusions.get_concrete_function(
        queries=tf.TensorSpec(shape=[None], dtype=tf.string),
        exclusions=tf.TensorSpec(shape=[None, 10], dtype=tf.string),
    ),
)
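
For completeness, a hedged usage sketch (the "<OOV>" padding token and the movie titles are placeholders; a single concrete function saved this way is typically exposed under the default "serving_default" key): every row of exclusions has to be padded out to width 10 before calling the signature.

loaded = tf.saved_model.load(model_dir)

queries = tf.constant(["42"])
# Pad each exclusions row to exactly 10 entries with an out-of-vocabulary token.
exclusions = tf.constant([["Movie A", "Movie B"] + ["<OOV>"] * 8])

# Signature functions return a dict of named output tensors.
result = loaded.signatures["serving_default"](queries=queries, exclusions=exclusions)
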
jneeven commented 2 years ago

> And just to confirm for future readers and maybe @alimirferdos - tf.TensorSpec works for the exclusions argument. I used something like this for model saving. In this case, you would need to pad all exclusions arrays to 10 in the 2nd dimension with out of vocabulary tokens.

Thanks for this! Is there really no better way than having to pad all exclusions? In that case, it's easier to just handle the exclusions myself...

jasonzyx commented 2 years ago

Edit: I figured it out myself.

For those who, like me, didn't know how to call signatures, here is the way:

loaded.signatures["k_50"](tf.constant(["red dress"]))

Original question:

I was able to create the signatures but I'm unable to call them. Below is my code; is anything wrong?

path = './model_300k_v2_training'

recommendations_at_50 = tf.function(lambda query: scann(query, k=50))

# Call them to create concrete functions.
some_input = tf.constant(["red dress"])
_ = recommendations_at_50(some_input)

tf.saved_model.save(
  scann,
  path,
  signatures={
      "k_50": recommendations_at_50.get_concrete_function(query=tf.TensorSpec(shape=(1,), dtype=tf.string)),
   },
  options=tf.saved_model.SaveOptions(namespace_whitelist=["Scann"]),
)

loaded = tf.saved_model.load(path)

# below code returns the error -- I wonder what is the correct way to call it?
_, titles = loaded(tf.constant(["red dress"]), k=50)

It gives me this error:

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (3 total):
    * Tensor("queries:0", shape=(1,), dtype=string)
    * 50
    * False
  Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
  Positional arguments (3 total):
    * TensorSpec(shape=(None,), dtype=tf.string, name='input_1')
    * None
    * False
  Keyword arguments: {}

Option 2:
  Positional arguments (3 total):
    * TensorSpec(shape=(None,), dtype=tf.string, name='queries')
    * None
    * False
  Keyword arguments: {}

Option 3:
  Positional arguments (3 total):
    * TensorSpec(shape=(None,), dtype=tf.string, name='queries')
    * None
    * True
  Keyword arguments: {}

Option 4:
  Positional arguments (3 total):
    * TensorSpec(shape=(None,), dtype=tf.string, name='input_1')
    * None
    * True
  Keyword arguments: {}