@Gabriel4256 the no-operator warning you see above is likely the result of an unsuccessful compilation of the model. From the description above, it looks like you are compiling the BERT model on an Inf1 instance. For a BERT large model, we recommend running the compilation on a c5.4xlarge instance, as described under the Launch instances tab of the tutorial: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/tensorflow-neuron/tutorials/bert_demo/bert_demo.html#tensorflow-bert-demo. Please let us know if this addresses your issue.
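For reference, a minimal TF 1.x compile invocation on the CPU instance looks roughly like the sketch below (paths are placeholders; tfn.saved_model.compile is the tensorflow-neuron 1.x entry point, and the returned dict reports how much of the graph landed on Neuron, as in the output later in this thread):

import tensorflow.neuron as tfn

# Runs neuron-cc on the CPU instance; copy the output directory to the Inf1 instance afterwards.
result = tfn.saved_model.compile('./bert-saved-model', './bert-saved-model-neuron')
print(result)  # e.g. {'OnNeuronRatio': 1.0} when all ops compile to Neuron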
@aws-joshim Execution on a c5.4xlarge instance produces almost the same result:
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
2022-01-05 02:42:24.876594: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2022-01-05 02:42:24.897271: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2999995000 Hz
2022-01-05 02:42:24.897869: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5654d52be700 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-05 02:42:24.897892: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/contrib/predictor/saved_model_predictor.py:153: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
2022-01-05 02:42:40.276883: I tensorflow/core/grappler/devices.cc:60] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA support)
2022-01-05 02:42:40.277032: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2022-01-05 02:42:42.047896: I tensorflow/neuron/grappler/convert/segment.cc:456] There are 1639 ops of 36 different types in the graph that are not compiled by neuron-cc: LogSoftmax, GreaterEqual, RandomUniform, Tanh, ArgMax, Pow, Softmax, BatchMatMul, Fill, Cast, Mul, SquaredDifference, Mean, RealDiv, Transpose, Slice, LessEqual, ExpandDims, Sub, Const, Pack, GatherV2, NoOp, MatMul, BiasAdd, Shape, StridedSlice, Rsqrt, Reshape, Identity, Assert, Placeholder, OneHot, Squeeze, Add, All, (For more information see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/neuron-cc-ops/neuron-cc-ops-tensorflow.html).
2022-01-05 02:42:42.062828: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:786] Optimization results for grappler item: graph_to_optimize
2022-01-05 02:42:42.062863: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] aws_neuron_static_shape_inference: Graph size after: 1638 nodes (0), 1934 edges (0), time = 560.523ms.
2022-01-05 02:42:42.062870: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:788] aws_neuron_fuse_supported_operators: Graph size after: 1638 nodes (0), 1934 edges (0), time = 95.084ms.
INFO:tensorflow:Number of operations in TensorFlow session: 8830
INFO:tensorflow:Number of operations after tf.neuron optimizations: 1638
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
WARNING:tensorflow:Converted /home/ubuntu/AWS_Neuron_scripts/tensorflow/bert-2 to ./bert-saved-model-neuron but no operator will be running on AWS machine learning accelerators. This is probably not what you want. Please refer to https://github.com/aws/aws-neuron-sdk for current limitations of the AWS Neuron SDK. We are actively improving (and hiring)!
{'OnNeuronRatio': 0.0}
It might be a problem with the BERT model I used. I created the model using the example in the Google BERT GitHub repository (https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb).
After training, I just added the following code to save the model in SavedModel format:
features = {
    "input_ids": tf.placeholder(shape=[None, FLAGS.max_seq_length], dtype=tf.int32, name='input_ids'),
    "input_mask": tf.placeholder(shape=[None, FLAGS.max_seq_length], dtype=tf.int32, name='input_mask'),
    "segment_ids": tf.placeholder(shape=[None, FLAGS.max_seq_length], dtype=tf.int32, name='segment_ids'),
    "label_ids": tf.placeholder(shape=[None], dtype=tf.int32, name='label_ids'),
    "is_real_example": tf.placeholder(shape=[None], dtype=tf.int32, name='is_real_example'),
}
serving_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(features)
estimator._export_to_tpu = False
estimator.export_saved_model(
    export_dir_base='./bert_classifier_saved_model',
    serving_input_receiver_fn=serving_input_fn)
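Before compiling, the exported signature can be checked to confirm the input names and dtypes; a minimal sketch, assuming the export landed in a timestamped subdirectory (the path below is a placeholder):

import tensorflow as tf  # TF 1.x

with tf.Session(graph=tf.Graph()) as sess:
    # export_saved_model writes into a timestamped subdirectory of export_dir_base
    meta_graph = tf.saved_model.loader.load(
        sess, ['serve'], './bert_classifier_saved_model/1609800000')
    print(meta_graph.signature_def['serving_default'])  # inputs/outputs with names and dtypes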
Please let me know if I am missing something. Also, I would appreciate it if you could share a pretrained BERT model that compiles well with Neuron.
We looked at the attached Colab notebook content. The reason for the incompatible SavedModel format is that the model from tensorflow-hub contains different operator names, causing compilation to fail. We also noticed that the attached Jupyter notebook points to a TensorFlow 2.x tutorial, while the compilation steps mentioned above use the tensorflow-neuron 1.x script. The following TF 2.x compilation script, with the tensorflow-neuron 2.x version, could address your compilation issue:
import argparse

import tensorflow as tf
import tensorflow.neuron as tfn


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_saved_model', required=True, help='Original SavedModel')
    parser.add_argument('--output_saved_model', required=True, help='Output SavedModel that runs on Inferentia')
    parser.add_argument('--batch_size', type=int, default=1)
    parser.add_argument('--sequence_length', type=int, default=128)
    args = parser.parse_args()
    model = tf.saved_model.load(args.input_saved_model)
    wfunc = model.signatures['serving_default']
    input_ids = tf.zeros([args.batch_size, args.sequence_length], dtype=tf.int32)
    input_mask = tf.zeros([args.batch_size, args.sequence_length], dtype=tf.int32)
    segment_ids = tf.zeros([args.batch_size, args.sequence_length], dtype=tf.int32)
    is_real_example = tf.zeros([args.batch_size], dtype=tf.int32)
    label_ids = tf.zeros([args.batch_size], dtype=tf.int32)
    print(wfunc)  # to see its calling signature
    example_inputs = [input_ids, input_mask, is_real_example, label_ids, segment_ids]
    wfunc_neuron = tfn.trace(wfunc, example_inputs)
    signatures = {'serving_default': wfunc_neuron.aws_neuron_function}
    tf.saved_model.save(model, args.output_saved_model, signatures)


if __name__ == '__main__':
    main()
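Assuming the script is saved as compile_bert_tf2.py (a hypothetical name), it can be run as:

python compile_bert_tf2.py --input_saved_model ./bert_classifier_saved_model/1609800000 --output_saved_model ./bert-saved-model-neuron

Note that tfn.trace executes the function once on the example inputs, so the batch size and sequence length passed on the command line are fixed at compile time.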
It works fine, thank you.
Is there any way to compile and run the BERT-large model on TF 1? I need this because some profiling tools only work on TF 1. I've tried some BERT models, but most of the operations were compiled to run on the CPU, as shown below:
INFO:tensorflow:Number of operations in TensorFlow session: 8830
INFO:tensorflow:Number of operations after tf.neuron optimizations: 1638
INFO:tensorflow:Number of operations placed on Neuron runtime: 26
For this, I downloaded a checkpoint file from the Google BERT GitHub repository and converted it to the TF 1 SavedModel format using the following code:
import shutil

import numpy as np
import tensorflow as tf

with tf.Session(graph=tf.Graph()) as sess:
    # Initialize v1 since the saver will not.
    segment_ids = tf.saved_model.utils.build_tensor_info(tf.constant(np.zeros((1, 128))))
    input_ids = tf.saved_model.utils.build_tensor_info(tf.constant(np.zeros((1, 128))))
    input_mask = tf.saved_model.utils.build_tensor_info(tf.constant(np.zeros((1, 128))))
    # label_ids = tf.saved_model.utils.build_tensor_info(tf.constant(np.zeros((1))))
    label = tf.saved_model.utils.build_tensor_info(tf.constant(np.zeros((1))))
    loader = tf.compat.v1.train.import_meta_graph('/home/ubuntu/models/wwm_uncased_L-24_H-1024_A-16/bert_model.ckpt.meta')
    loader.restore(sess, "/home/ubuntu/models/wwm_uncased_L-24_H-1024_A-16/bert_model.ckpt")
    shutil.rmtree("./saved_model_test", ignore_errors=True)
    builder = tf.compat.v1.saved_model.builder.SavedModelBuilder("./saved_model_test")
    signature_def = tf.compat.v1.saved_model.build_signature_def(
        inputs={
            "segment_ids": segment_ids,
            "input_ids": input_ids,
            "input_mask": input_mask,
        },
        outputs={"label": label})
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.SERVING],
        signature_def_map={
            tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def},
        strip_default_attrs=True)
    builder.save()
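To see which op types the converted model actually contains (and compare them against the neuron-cc supported-operator list linked in the warning above), the graph can be inspected with a short sketch like this:

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ['serve'], './saved_model_test')
    # Collect the distinct op types; anything outside the supported list stays on CPU.
    op_types = sorted({op.type for op in sess.graph.get_operations()})
    print(op_types)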
And finally, I compiled it with the following code:
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ['serve'], MODEL_DIR)
    result = tfn.saved_model.compile(
        args.input_saved_model, args.output_saved_model,
    )
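Incidentally, the dict returned by tfn.saved_model.compile can be checked programmatically; assuming the same 'OnNeuronRatio' key shown in the earlier output, something like:

# Warn when most operations stay on the CPU instead of Inferentia.
if result.get('OnNeuronRatio', 0.0) < 0.5:
    print('Most operations were not placed on Neuron:', result)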
I tried this on a c5.4xlarge instance, and everything else is the same as in the earlier post.
Thanks in advance.
Hi @Gabriel4256, to compile the TF BERT large model you will need to follow the steps here. The default compile script doesn't work in this case.
Hi @Gabriel4256, please let us know if you still have problems compiling the TF BERT large model after following the steps mentioned in the previous post. Thanks.
Hi, team. I am having some trouble following the TF 1 BERT tutorial here.
Here are my execution environments:
I tried to use a pretrained model from https://github.com/google-research/bert and followed the instructions here.
After setting everything else up properly, I executed the example usage script in the tutorial, which gave me the following result:
According to the message, it seems like no operator is compiled for Inferentia. Is this intended? Please let me know if I am missing something.
And this is the result of pip list:
Thanks in advance.