Closed: rcostu closed this issue 7 months ago.
Triage note: Is this LSTM-only, or does it reproduce without any RNN-related layer? Also adding @k-w-w from the saved-model side for this question.
Yes, I have replicated the test with the same outcome after removing the LSTM, with no RNN layers at all.
I see. So this is probably a generic save/load issue in the TF stack; @k-w-w is probably the best person for it, and I am not sure how compatible the TF Go backend is.
It looks like the Go wrapper is calling the C API (TF_LoadSessionFromSavedModel) to load the SavedModel, but when I try calling the method directly I don't see the same issue.
Code:
#include "tensorflow/c/c_api.h"
#include "tensorflow/cc/saved_model/tag_constants.h"
#include "tensorflow/core/platform/env.h"
namespace tf = tensorflow;

std::string saved_model_dir = "/tmp/test_model_go";
TF_CHECK_OK(tf::Env::Default()->FileExists(saved_model_dir));

// Serialize the ConfigProto and hand it to the C API via TF_SetConfig;
// in the original snippet session_options was built but never passed in.
tf::SessionOptions session_options;
(*session_options.config.mutable_device_count())["GPU"] = 1;
session_options.config.mutable_gpu_options()
    ->set_per_process_gpu_memory_fraction(0.5);
std::string config_proto;
session_options.config.SerializeToString(&config_proto);

TF_SessionOptions* tf_session_options = TF_NewSessionOptions();
TF_Status* tf_status = TF_NewStatus();
TF_SetConfig(tf_session_options, config_proto.data(), config_proto.size(),
             tf_status);
TF_Graph* tf_graph = TF_NewGraph();
const char* tags[] = {tf::kSavedModelTagServe};
TF_Session* sess = TF_LoadSessionFromSavedModel(
    tf_session_options, /*run_options=*/nullptr, saved_model_dir.c_str(),
    tags, /*tags_len=*/1, tf_graph, /*metagraph_buffer=*/nullptr, tf_status);

TF_DeleteSession(sess, tf_status);
TF_DeleteGraph(tf_graph);
TF_DeleteSessionOptions(tf_session_options);
TF_DeleteStatus(tf_status);
Logs:
I0130 20:59:26.284406 188638 reader.cc:83] Reading SavedModel from: /tmp/test_model_go
I0130 20:59:26.284449 188638 merge.cc:149] Reading binary proto from /tmp/test_model_go/saved_model.pb
I0130 20:59:26.285070 188638 merge.cc:152] Finished reading binary proto, took 624 microseconds.
I0130 20:59:26.285089 188638 reader.cc:51] Reading meta graph with tags { serve }
I0130 20:59:26.285094 188638 reader.cc:153] Reading SavedModel debug info (if present) from: /tmp/test_model_go
I0130 20:59:26.285128 188638 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
I0130 20:59:26.308795 188638 mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled
I0130 20:59:26.309449 188638 loader.cc:234] Restoring SavedModel bundle.
I0130 20:59:26.338187 188638 loader.cc:218] Running initialization op on SavedModel bundle at path: /tmp/test_model_go
I0130 20:59:26.348788 188638 loader.cc:317] SavedModel load for tags { serve }; Status: success: OK. Took 64387 microseconds.
The only things I can think of are that there is an issue with the wrapper, or that there is a versioning difference. I'm exporting the model at head and loading it at head, while the tfgo library uses a TensorFlow fork pinned at 2.9. @rcostu, could you try exporting with the same TensorFlow version as the fork?
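To make the versioning hypothesis concrete: the suggestion above amounts to checking that the TensorFlow version used for export shares its major.minor with the 2.9 fork tfgo is built against. A minimal sketch of that check — the helper name and the pinned version default are my assumptions, not from either codebase:

```python
# Hypothetical helper: does the exporting TF version share major.minor
# with the fork version that the Go bindings were built against?
def export_matches_fork(export_version: str, fork_version: str = "2.9") -> bool:
    major_minor = ".".join(export_version.split(".")[:2])
    return major_minor == fork_version

print(export_matches_fork("2.9.1"))   # True
print(export_matches_fork("2.15.0"))  # False
```

In practice you would feed this the value of `tf.__version__` from the Python environment that ran the export.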
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
Hi,
Redirected this ticket from the tfgo repository, as the problem seems to be in the tensorflow library; but since it is related to keras, they redirected me here from issue #63824 in the TF project. The problem documented here happens exactly the same way using either tfgo's LoadModel function or tf's LoadSavedModel function.
I have developed a model that makes a time-series forecast: it receives data from the last 60 days as floats and returns one float. I am training the model in Python and saving it with the export function, as stated in the tf documentation for keras models.
The model definition in Python looks like this:
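(The model snippet itself did not survive into this thread. As a stand-in, here is a minimal sketch of a model matching the description above — one float per timestep over 60 days, one float out. The layer sizes and the lstm_2_input name are assumptions inferred from the saved_model_cli command, not the original code.)

```python
from tensorflow import keras

# Assumed architecture: 60 timesteps x 1 feature in, one float out.
model = keras.Sequential([
    keras.layers.Input(shape=(60, 1), name="lstm_2_input"),
    keras.layers.LSTM(50),   # hidden size is a guess
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Save a SavedModel for serving via the export function from the Keras docs.
model.export("/tmp/test_model_go")
```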
The saved_model_cli show command has this output:
And saved_model_cli run with the following command works as expected:
saved_model_cli run --dir /Users/rcostumero//Downloads/test_model_go --tag_set serve --signature_def serve --input_exprs='lstm_2_input=[[[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]]]'
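For reference, the literal in that command is a single batch of 60 one-float timesteps. A quick way to build the same input shape programmatically (assuming numpy) instead of typing the nested list out:

```python
import numpy as np

# One batch, 60 timesteps, 1 feature -- the shape lstm_2_input expects.
x = np.ones((1, 60, 1), dtype=np.float32)
print(x.shape)  # (1, 60, 1)
```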
However, when running the Go code as in the example given, it panics while loading the model. The code looks like this:
And the panic shows this error:
I have tried everything I could come up with and searched for similar cases, but I didn't find any solution.
Any help is more than welcome.
Thanks in advance!