tillwf opened 3 years ago
I tried with the latest version of tensorflow-ranking (0.4.0) and it is still not working. Could someone help me? Thank you.
Here is an ncdu listing of the model folder:
335.0 MiB [##########] /train
31.5 MiB [ ] saved_model.pb
3.1 MiB [ ] /validation
2.4 MiB [ ] /variables
792.0 KiB [ ] keras_metadata.pb
84.0 KiB [ ] /assets
I tried without the train folder, but it didn't change the message.
Any clue?
I tracked the memory consumption of a script doing:
import tensorflow as tf
model = tf.saved_model.load("model_path")
(the model path does not contain the train folder) and saw the resident memory climb to roughly 2 GiB (memory-usage graph omitted here).
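For what it's worth, here is a minimal sketch of how that measurement can be reproduced; it assumes psutil is installed, and "model_path" is a placeholder for the real directory:

import os
import psutil
import tensorflow as tf

proc = psutil.Process(os.getpid())
rss_before = proc.memory_info().rss  # resident set size before loading

model = tf.saved_model.load("model_path")  # placeholder path

rss_after = proc.memory_info().rss
print(f"RSS grew by {(rss_after - rss_before) / 2**20:.0f} MiB during load")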
Is there a way to reduce this memory usage? The model weighs only ~30 MiB on disk but grows to ~2 GiB in memory.
I tried to reduce the size of the model by doing:
converter = tf.lite.TFLiteConverter.from_keras_model(ranker)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
converter.convert().save(model_dir, save_format='tf', signatures=signatures)
but I got this error:
2021-06-28 14:55:07.469989: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
W0628 14:55:07.577553 140086528587584 signature_serialization.py:151] Function `_wrapped_model` contains input name(s) args_0 with unsupported characters which will be renamed to args_0_275 in the SavedModel.
W0628 14:55:34.146763 140086528587584 save.py:243] Found untraced functions such as listwise_dense_features_layer_call_and_return_conditional_losses, listwise_dense_features_layer_call_fn, dense_3_layer_call_and_return_conditional_losses, dense_3_layer_call_fn, listwise_dense_features_layer_call_and_return_conditional_losses while saving (showing 5 of 65). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /tmp/tmpte46l_mb/assets
I0628 14:55:41.276194 140086528587584 builder_impl.py:775] Assets written to: /tmp/tmpte46l_mb/assets
2021-06-28 14:55:48.804822: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-06-28 14:55:48.804951: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-06-28 14:55:48.949690: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1144] Optimization results for grappler item: graph_to_optimize
function_optimizer: function_optimizer did nothing. time = 0.042ms.
function_optimizer: function_optimizer did nothing. time = 0ms.
*** tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array.
Could it be related to this answer: https://github.com/tensorflow/tensorflow/issues/37441#issuecomment-775747315 ?
Does anyone have any idea how I can reduce the model's memory footprint?
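Side note on the snippet above: converter.convert() returns the TFLite FlatBuffer as raw bytes, so there is no .save() method to chain on it. A minimal sketch of the usual write-out pattern, assuming ranker is the trained Keras model:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(ranker)  # ranker: the trained Keras model
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # OPTIMIZE_FOR_SIZE is a deprecated alias of DEFAULT
tflite_bytes = converter.convert()  # serialized FlatBuffer, returned as bytes
with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)

Though, as noted in the reply below, a .tflite file cannot be turned back into a SavedModel, so this path may not help here anyway.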
We suffer from the same issue; it seems to come from the structure of the model (number of ops). For now we tried reducing the number of layers, and that seems to have reduced the issue. I'm not sure TF Lite would work: you won't be able to export it back to the SavedModel format.
Did you find anything else on your side?
Hello @tanguycdls We did not find any solution yet, but this is critical for us: we won't be able to use TFRanking without a fix. We only have 3 layers for the moment, which does not seem like a big number. We will try with one layer just to see, but that is not a viable solution either.
We think we found a workaround, but we're still not sure it's viable: we transformed our Keras models into the old (TF1-format) frozen graph, which we then re-attach to a SavedModel. It seems to reduce the RAM usage.
Take a look at this: https://leimao.github.io/blog/Save-Load-Inference-From-TF2-Frozen-Graph/
and then, once you have your concrete function, re-attach it to a tf.Module:
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

# Wrap the Keras model in a tf.function and trace a ConcreteFunction
full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))  # you should fix that to the correct input shapes

# Get frozen ConcreteFunction: the variables are inlined as graph constants
frozen_func = convert_variables_to_constants_v2(full_model)

# Re-attach the frozen function to a tf.Module and export it
module = tf.Module()
module.func = frozen_func
tf.saved_model.save(module, "export_dir",  # "export_dir" is a placeholder path
                    signatures={"serving_default": frozen_func})  # must specify the signature
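To sanity-check the export, here is a quick hypothetical reload through the signature; the dummy shape is a placeholder, and the keyword x comes from the traced lambda's argument name:

reloaded = tf.saved_model.load("export_dir")
serve = reloaded.signatures["serving_default"]
dummy = tf.zeros([1, 136])  # placeholder shape; use your model's real input shape
print(serve(x=dummy))  # keyword "x" matches the traced function's argument name

Because freezing turns the variables into graph constants, the reloaded module holds no resource variables, which is presumably where the RAM saving comes from.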
Again, we're still working on the topic, so we don't have any long-term view of the solution: there might be an issue somewhere...
If you find an issue or have a better idea, please tell us!
And some links: https://github.com/search?q=convert_variables_to_constants_v2&type=code
Hello @tanguycdls Thank you again for your help. Did you find any proper solution? Yours does not work for us, as it raises another exception.
We still use the solution above; this plus some Grappler optimizations fixed the issue for most domains. Are you using a model that cannot be frozen?
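For reference, the public knob for Grappler passes is tf.config.optimizer.set_experimental_options; the exact passes we toggled may differ, so treat this as a sketch:

import tensorflow as tf

# Sketch: enable specific Grappler passes process-wide, before loading the model.
tf.config.optimizer.set_experimental_options({
    "constant_folding": True,
    "arithmetic_optimization": True,
    "function_optimization": True,
})
model = tf.saved_model.load("export_dir")  # placeholder path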
Hello,
I'm trying to upload a model generated with TFRanking (32 MB) to BigQuery, which I saved like this:
but I got this error:
Previously I managed to upload a bigger model (>200 MB) created with regular TF 1.13 code, so I don't understand the message.
Has anyone already encountered this?
Thanks
On Ubuntu 18.04, Python 3.7.3
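For anyone hitting the same thing: BigQuery imports plain TF SavedModels, so a hypothetical export along these lines (all names below are placeholders, not the code from the post above) is the kind of artifact it expects:

import tensorflow as tf

# Hypothetical sketch: export the ranker with an explicit serving signature.
@tf.function(input_signature=[tf.TensorSpec([None, 136], tf.float32, name="features")])
def serve(features):
    return {"predictions": ranker(features)}  # ranker: the trained Keras model

tf.saved_model.save(ranker, "/tmp/ranker_export",  # then copy to GCS for BigQuery
                    signatures={"serving_default": serve})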