Closed: luckyapplehead closed this issue 6 years ago
I fixed the issue by changing requirements.txt to:
tensorflow==1.3.0
tensorflow-transform==0.3.1
protobuf==3.4.0
and adding the install requirements to setup.py as below:
TENSORFLOW = 'tensorflow==1.3.0'
TENSORFLOW_TRANSFORM = 'tensorflow-transform==0.3.1'
PROTOBUF = 'protobuf==3.4.0'
...
install_requires=[TENSORFLOW, TENSORFLOW_TRANSFORM, PROTOBUF])
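For completeness, here is a minimal sketch of what a full setup.py with these pins could look like; the package name and version number below are placeholders, not taken from the sample:

    # Illustrative setup.py sketch pinning the versions mentioned above.
    # The package name and version number are placeholders.
    from setuptools import find_packages, setup

    TENSORFLOW = 'tensorflow==1.3.0'
    TENSORFLOW_TRANSFORM = 'tensorflow-transform==0.3.1'
    PROTOBUF = 'protobuf==3.4.0'

    setup(
        name='trainer',   # placeholder package name
        version='0.1',
        packages=find_packages(),
        install_requires=[TENSORFLOW, TENSORFLOW_TRANSFORM, PROTOBUF])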
The training process then runs successfully, but when I go to the next step, creating the model version, the same error occurs again...
The command I used:
gcloud ml-engine versions create "v1" --model "movielens" --origin "${MODEL_SOURCE}"
Error message:
ERROR: (gcloud.ml-engine.versions.create) Bad model detected with error: "Error loading the model: Could not load model: Loading servable: {name: default version: 1} failed: Not found: Op type not registered 'HashTableV2'\n\n"
Is there anything I can do to fix this problem?
Can you please try the updated setup.py? Also, looking at the command above, I am assuming it's movielens you are running?
@puneith Hi, could you tell me what changes I should make to the updated setup.py? Yes, I'm running movielens :)
I updated setup.py to the newest version, and the same error occurs when I run the following command:
gcloud ml-engine versions create "v4" --model "movielens" --origin "${MODEL_SOURCE}"
error message:
ERROR: (gcloud.ml-engine.versions.create) Bad model detected with error: "Error loading the model: Could not load model: Loading servable: {name: default version: 1} failed: Not found: Op type not registered 'HashTableV2'\n\n"
@luckyapplehead Since Cloud ML Engine is still on TF1.2, using TF1.3 for training causes a discrepancy between training and prediction. We will have TF1.4 available on Cloud ML Engine very soon, so this should go away. In the meantime, can you please try training with TF1.2?
@puneith I am getting this error with TF 1.2. I also raised a support ticket, but no help so far.
@luckyapplehead Can you please try training with TF1.2 and then create the model to see if the error persists? That means you will need to make sure the TF1.3 pin in setup.py is commented out.
@puneetjindal Are you getting the error with TF1.2 for training or prediction? If it's for prediction, can you please confirm you trained using TF1.2?
@puneith My training with TF 1.2 works fine. My model gets exported to GCS along with the variables folder. The issue comes when I go to create a version of the model in GCP ML Engine. I trained using TF 1.2 only.
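One way to check whether an exported model actually references newer ops such as HashTableV2 is to list the op types recorded in its saved_model.pb. A minimal sketch, assuming TensorFlow is installed locally and that the export directory path below is a placeholder:

    import os
    from tensorflow.core.protobuf import saved_model_pb2

    EXPORT_DIR = '/path/to/exported/model'  # placeholder: local copy of the GCS export

    # Parse the SavedModel proto and collect every op type used in its graphs.
    saved_model = saved_model_pb2.SavedModel()
    with open(os.path.join(EXPORT_DIR, 'saved_model.pb'), 'rb') as f:
        saved_model.ParseFromString(f.read())

    ops = set()
    for meta_graph in saved_model.meta_graphs:
        for node in meta_graph.graph_def.node:
            ops.add(node.op)

    # If 'HashTableV2' appears here, the serving runtime needs a TF version
    # that registers it, otherwise version creation will fail as above.
    print(sorted(ops))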
@dsdelhi Can I please get your GCP project_id?
@puneith I hope you have noted my project id
@puneetjindal @dsdelhi Sorry for the delay. Are we still seeing this issue on TF1.4 Cloud ML Engine? Please send me your project_id if you are still seeing the issue. @puneetjindal I don't see your project_id in this thread.
I shared the project id, but the issue still exists.
@dsdelhi Where did you share it?
I have the same problem, although I see that GC MLE now uses TF 1.4
I solved this by deploying with an explicit --runtime-version 1.4 argument.
The default runtime_version for CMLE is still 1.0, so you need to specify the runtime version explicitly, as @girijaravishankar mentions in the comment above. Anyone still facing the issue, please reopen this.
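For reference, this is what that looks like when creating the version; the model name and origin mirror the commands earlier in this thread, and ${MODEL_SOURCE} is whatever GCS path your export landed in:

    gcloud ml-engine versions create "v1" \
        --model "movielens" \
        --origin "${MODEL_SOURCE}" \
        --runtime-version 1.4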
Describe the problem
I ran the code and the commands exactly as in the guide to train the DNN model on Google Cloud ML Engine, but the error below occurred: "No op named HashTableV2 in defined operations". I also tried 'tensorflow-transform==0.1.10' and 'tensorflow==1.2.0' as described in requirements.txt and setup.py, but the same error occurred. It seems the environment requirements for running this example are not correct. @elmer-garduno
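As a quick sanity check (illustrative, not part of the sample), you can log the versions the training job actually imports and compare them against the pins in setup.py:

    import pkg_resources
    import tensorflow as tf

    # Versions actually present in the training environment.
    print('tensorflow:', tf.__version__)
    print('tensorflow-transform:',
          pkg_resources.get_distribution('tensorflow-transform').version)
    print('protobuf:',
          pkg_resources.get_distribution('protobuf').version)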
Source code / logs
The full error log is below:
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 813, in <module>
    main()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 809, in main
    output_dir=output_path)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 106, in run
    return task()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 465, in train_and_evaluate
    export_results = self._maybe_export(eval_result)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 484, in _maybe_export
    compat.as_bytes(strategy.name))))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/export_strategy.py", line 32, in export
    return self.export_fn(estimator, export_path)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/utils/saved_model_export_utils.py", line 283, in export_fn
    exports_to_keep=exports_to_keep)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/framework/experimental.py", line 64, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1258, in export_savedmodel
    input_ops = input_fn()
  File "/root/.local/lib/python2.7/site-packages/tensorflow_transform/saved/input_fn_maker.py", line 46, in serving_input_fn
    receiver = receiver_fn()
  File "/root/.local/lib/python2.7/site-packages/tensorflow_transform/saved/input_fn_maker.py", line 375, in parsing_transforming_serving_input_receiver_fn
    transform_savedmodel_dir, raw_features))
  File "/root/.local/lib/python2.7/site-packages/tensorflow_transform/saved/saved_transform_io.py", line 248, in partially_apply_saved_transform
    saved_model_dir, logical_input_map, tensor_replacement_map)
  File "/root/.local/lib/python2.7/site-packages/tensorflow_transform/saved/saved_transform_io.py", line 142, in _partially_apply_saved_transform_impl
    input_map=input_map)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1566, in import_meta_graph
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/meta_graph.py", line 498, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.py", line 260, in import_graph_def
    raise ValueError('No op named %s in defined operations.' % node.op)
ValueError: No op named HashTableV2 in defined operations.