Closed xiaoyaoyang closed 2 years ago
You need to check the dtype of the input data fields.
Example:
From
input_data = {
    'user_id': np.array([8]),
    'user_name': np.array(["Someone"]),
}
To
input_data = {
    'user_id': np.array([8], dtype=np.int32),
    'user_name': np.array(["Someone"]),
}
It depends on what your model is expecting to get.
@almirb Thanks for the reply! I tried with one record (I don't know how to pick a single record, hence this code; please let me know if there is a better way :) )
for row in train_map.batch(1).take(1):
    print(row)
Just to clarify: in my case, brute_force(row) works, but loaded(row) throws errors. Between brute_force and loaded, I first make sure I call brute_force once, and I simply copy-pasted the code from this tutorial.
It also works if I do NOT specify the query model when saving the BruteForce layer, so the call would be loaded(my_query_model(row)). It feels like it is fine to save only the transformation that starts from the embedding (my_query_model(row) returns the embedding, I think), but if I store the query model that creates the embedding inside brute_force, it gives me errors.
@xiaoyaoyang When serialising a model, TensorFlow creates a strict function call signature based on tracing the model. Before serialising, you passed in a dict containing:
{
    'PRICE': TensorSpec(shape=(None,), dtype=tf.float32, name='PRICE'),
    'USER_ID': TensorSpec(shape=(None,), dtype=tf.string, name='USER_ID'),
    'TRANS_COUNT': TensorSpec(shape=(None,), dtype=tf.int64, name='TRANS_COUNT'),
    'SKU_KEY': TensorSpec(shape=(None, 1), dtype=tf.string, name='SKU_KEY'),
    'SKU_DESC': TensorSpec(shape=(None,), dtype=tf.string, name='SKU_DESC'),
    'CONTEXT_ID': TensorSpec(shape=(None, 5), dtype=tf.string, name='CONTEXT_ID'),
}
When you serialise your model, TensorFlow will create a call signature that expects all those inputs, even if your model doesn't use them. So when you call it with just USER_ID, it will fail.
It's best to ensure you only pass the required features into your model during training and evaluation. Alternatively you should be able to resolve this by calling the model once with an example record with only the required features before serialising. This will then result in another call signature that matches the input you expect to pass when serving.
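A minimal sketch of the first suggestion: strip each record down to the features the query model needs before it reaches the model. The feature names are taken from the signature above; `QUERY_FEATURES` and `select_features` are hypothetical names, and plain Python lists stand in for tensors so the snippet runs without TensorFlow.

```python
# Hypothetical: the only feature the query model actually reads.
QUERY_FEATURES = ("USER_ID",)


def select_features(row, required=QUERY_FEATURES):
    """Return a new dict holding only the required features."""
    missing = [name for name in required if name not in row]
    if missing:
        raise KeyError(f"row is missing required features: {missing}")
    return {name: row[name] for name in required}


# A record carrying both query and candidate features.
full_row = {"PRICE": [9.99], "USER_ID": ["u42"], "TRANS_COUNT": [3]}

# Only USER_ID survives, so a model traced on this dict would get a
# call signature that expects USER_ID alone.
query_row = select_features(full_row)
print(query_row)
```

With a `tf.data.Dataset` of dict records, the same function could probably be applied with `dataset.map(select_features)` ahead of training, evaluation, and the example call made before saving.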
As always, @patrickorlando has the right answer. The key here is passing only the features you need (here, only the user features) into your model, not all features.
I'm going to close this, but please re-open if this doesn't solve the issue for you.
@patrickorlando Thanks! I will give it a try...
(The inputs of my query_model
and candidate_model
are indeed different, and the input
I feed into those two models contain all information (query model's features + candidate model's features).. )
Hello @patrickorlando, can you also explain how to change a tensor's shape from (1,0) in the dict input to (None,) to meet the requirements of the saved model's input? Thanks!
Hey @cory1219,
You don't need to change the shape. If you pass a tensor with shape (n, m) to your model, the call signature will accept shape (None, m).
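The matching rule can be sketched in plain Python. This is only an illustration of the idea, not TensorFlow's implementation: `is_compatible` is a hypothetical helper in which `None` in the signature matches any size along that dimension, while the number of dimensions (the rank) has to match exactly.

```python
def is_compatible(shape, signature_shape):
    """Check whether a concrete shape fits a signature shape.

    None in the signature accepts any size for that dimension;
    the rank (number of dimensions) must be identical.
    """
    if len(shape) != len(signature_shape):
        return False
    return all(
        expected is None or expected == actual
        for actual, expected in zip(shape, signature_shape)
    )


print(is_compatible((8, 5), (None, 5)))   # batch of 8 fits (None, 5)
print(is_compatible((1,), (None,)))       # a single record fits (None,)
print(is_compatible((1, 1), (None,)))     # rank 2 does not fit rank 1
```

In TensorFlow itself, `tf.TensorSpec.is_compatible_with` performs this kind of check on real tensors and also compares dtypes.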
Hi @patrickorlando , thanks for your reply!! But I wonder if my input of the loaded model is like this:
input_data = { 'user_id': np.array([['6']]), 'product_name': np.array([["Apple"]]), }
Why can't the model accept the shape (1,1) for each feature? Isn't it equivalent to the shape (None, 1) that the call signature accepts?
Thanks!!
Hey @cory1219, It should work. Are you getting an error for the example above? Please post the error message here if you have one and it should be easier to give a specific answer.
Hi @patrickorlando
I have attached the error message as below. Can I change the format of input that the model can accept? Thanks!
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23596/1485082039.py in
~\AppData\Local\Temp/ipykernel_23596/1485082039.py in save_ranking(model)
     52 # Pass a customer id in, get top predicted product back.
     53 print(
---> 54 loaded({
     55 "customer_id": tf.constant(np.array([["0001019648"]])),
     56 "customer_price_group": tf.constant(np.array([["KK-nicht verwenden"]])),
~\Anaconda3\lib\site-packages\tensorflow\python\saved_model\load.py in _call_attribute(instance, *args, **kwargs)
    699
    700 def _call_attribute(instance, *args, **kwargs):
--> 701 return instance.__call__(*args, **kwargs)
    702
    703
~\Anaconda3\lib\site-packages\tensorflow\python\util\traceback_utils.py in error_handler(*args, **kwargs)
    151 except Exception as e:
    152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
    154 finally:
    155 del filtered_tb
~\Anaconda3\lib\site-packages\tensorflow\python\saved_model\function_deserialization.py in restored_function_body(*args, **kwargs)
    287 "Option {}:\n {}\n Keyword arguments: {}"
    288 .format(index + 1, _pretty_format_positional(positional), keyword))
--> 289 raise ValueError(
    290 "Could not find matching concrete function to call loaded from the "
    291 f"SavedModel. Got:\n {_pretty_format_positional(args)}\n Keyword "
ValueError: Could not find matching concrete function to call loaded from the SavedModel. Got: Positional arguments (2 total):
False Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1: Positional arguments (2 total):
Option 2: Positional arguments (2 total):
Option 3: Positional arguments (2 total):
Option 4: Positional arguments (2 total):
Hey @cory1219,
Looks like you have a rank mismatch. Your model expects tensors of shape (None,), which has rank 1, but you are passing in tensors of shape (1, 1), which has rank 2. Your input tensors should have shape (1,).
More concretely your model expects
{'product_id': ['abc'], ...}
and you are passing
{'product_id': [['abc']], ...}
Hope this helps.
Hi @patrickorlando,
Thank you so much for your help!! Now the prediction of the loaded ranking model finally works! But when I tried to predict using the loaded retrieval model, it showed a totally different error message. Can you also help me with that? Thank you!! My code:
def save_retrieval(model):
    # Create a BruteForce layer as before for prediction
    index = tfrs.layers.factorized_top_k.BruteForce(model.query_model)
    #index = tfrs.layers.factorized_top_k.ScaNN(model.query_model)
    index.index_from_dataset(
        tf.data.Dataset.zip((items.batch(100), items.batch(100).map(model.candidate_model)))
    )
    # Export the query model.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "retrieval")
        # Save the index.
        tf.saved_model.save(
            index,
            path,
            #options=tf.saved_model.SaveOptions(namespace_whitelist=["Scann"]),
        )
        # Load it back; can also be done in TensorFlow Serving.
        loaded = tf.saved_model.load(path)
        #print(type(loaded))
        time = datetime.datetime.strptime("2021", '%Y')
        time_input = datetime.datetime.timestamp(time)
        # Pass a customer id in, get top predicted product id back.
        scores, titles = loaded({
            "customer_id": np.array(["0001019648"]),
            "customer_price_group": np.array(["KK-nicht verwenden"]),
            "customer_type": np.array(["END-ACCOUNT"]),
            "customer_industry": np.array(["Power Utilities"]),
            "companyname_gu": np.array(["Zweckverband kommunaler Anteilseigner der WEMAG"]),
            "project_flag": np.array([0]),
            "timestamp": np.array([time_input]),
        }).numpy()
        print(f"Recommendations: {titles[0][:3]}")
save_retrieval(model)
The error message shows as follows:
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23596/1368832069.py in
~\AppData\Local\Temp/ipykernel_23596/1368832069.py in save_retrieval(model)
     25
     26 # Pass a customer id in, get top predicted product id back.
---> 27 scores, titles = loaded({
     28 "customer_id": np.array(["0001019648"]),
     29 "customer_price_group": np.array(["KK-nicht verwenden"]),
TypeError: '_UserObject' object is not callable
Glad to help @cory1219.
Before you can serialise the model, you need to build it. This happens automatically when you call the model with data. You just need to call your brute force index with an example record before you save it.
Thanks for your reply @patrickorlando
But I would like to confirm your statement, "call your brute force index with an example record before you save it." Is that equivalent to this line in the code above?
index.index_from_dataset(tf.data.Dataset.zip((items.batch(100), items.batch(100).map(model.candidate_model))))
By "call the model" I mean getting a prediction from it. After you index the brute_force layer, you need to run:
scores, identifiers = index(example_record)
About Save & Load BruteForce/Scann and Model object
I am working through this tutorial with an online shopping dataset, and followed the tutorial where the User tower is similar to this:
Where inputs is a dict. I am able to get the embedding by UserModel()(input_dict) just fine. The issue is when I work with the example in this link: https://www.tensorflow.org/recommenders/examples/efficient_serving, where we want to save the Scann/BF object.
I am able to get the Scann working and able to call it, and it returns meaningful results. However, if I follow the "Deploying the approximate model" section to save it and load it back, I get an error.
It seems to me that (1) the shapes are all (None,) and (2) it cannot identify the input dict anymore. The same thing happens if I try to save the model and load it back: it throws similar errors, but model(row) (row is a dict) works fine.
Can't do model.evaluate after replacing factorized_metrics with BruteForce
Another strange finding: with the above setup, if I define the query model in the BruteForce (brute_force = tfrs.layers.factorized_top_k.BruteForce(model.user_model)), then reset factorized_metrics and run model.evaluate (for fast performance), it gives me an error.
It seems it does not like the way I have the User tower take a dict as input and return self.user_model(inputs['USER_ID']). However, it works if I do not specify the User_Model when initializing BruteForce.
Any insights would be appreciated!