Closed subhamkhemka closed 3 years ago
Hi Subham,
You can find a sample of how to compile a Neuron model and reference it from a HuggingFace pipeline in our TensorFlow 2 tutorial ( https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/tensorflow/huggingface_bert/huggingface_bert.html ). You can use the same strategy to define the compiled model in PyTorch. Please let us know if this answers your question!
Hi
I'm getting an error when deploying the model from the sample you shared:
#now we can insert the neuron_model and replace the cpu model
#so now we have a huggingface pipeline that uses an underlying neuron model!
neuron_pipe.model = neuron_model
neuron_pipe.model.config = pipe.model.config
#Now let's run inference on neuron!
neuron_pipe('I want this sentence to be negative to show a negative sentiment analysis.')
UnavailableError: 2 root error(s) found.
(0) Unavailable: grpc server unix:/run/neuron.sock is unavailable. Please check the status of neuron-rtd service by `systemctl is-active neuron-rtd`. If it shows `inactive`, please install the service by `sudo apt-get install aws-neuron-runtime`. If `aws-neuron-runtime` is already installed, you may activate neuron-rtd service by `sudo systemctl restart neuron-rtd`.
[[node neuron_op_10d5affb7a47741c (defined at /home/ubuntu/neuron_tf2_env/lib/python3.6/site-packages/tensorflow_neuron/python/_trace.py:456) ]]
(1) Unavailable: grpc server unix:/run/neuron.sock is unavailable. Please check the status of neuron-rtd service by `systemctl is-active neuron-rtd`. If it shows `inactive`, please install the service by `sudo apt-get install aws-neuron-runtime`. If `aws-neuron-runtime` is already installed, you may activate neuron-rtd service by `sudo systemctl restart neuron-rtd`.
[[node neuron_op_10d5affb7a47741c (defined at /home/ubuntu/neuron_tf2_env/lib/python3.6/site-packages/tensorflow_neuron/python/_trace.py:456) ]]
[[neuron_op_10d5affb7a47741c/_6]]
0 successful operations.
0 derived errors ignored. [Op:__inference_pruned_9321]
Function call stack:
pruned -> pruned
I tried running the commands below:
sudo apt-get install aws-neuron-runtime
Reading package lists... Done
Building dependency tree
Reading state information... Done
aws-neuron-runtime is already the newest version (1.6.5.0).
The following packages were automatically installed and are no longer required:
libaio1 librados2 librbd1
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 25 not upgraded.
systemctl is-active neuron-rtd
inactive
sudo systemctl restart neuron-rtd
systemctl is-active neuron-rtd
inactive
I am running this using the Python (Neuron TensorFlow 2) Kernel
Please help
Thanks, Subham
It's possible the runtime is not starting because the driver is not active. Do you mind uninstalling and re-installing the aws-neuron-dkms (driver) package on your instance? Updates to the Linux kernel require reinstallation of our aws-neuron-dkms package, and this is the most likely issue.
If re-installing the driver does not fix the problem, please post the output of these commands to help us debug the issue further:
lsmod | grep neuron
sudo systemctl status neuron-rtd
Additional details on this debug topic can be found here, along with other potentially useful troubleshooting info related to the runtime: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-runtime/nrt-troubleshoot.html#neuron-services-fail-to-start
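For anyone who wants to run the same two checks from a notebook cell, here is a minimal Python sketch. It only wraps the `lsmod` and `systemctl is-active neuron-rtd` commands mentioned in this thread. The helper names (`module_loaded`, `run`, `diagnose`) are made up for illustration, and matching on the substring "neuron" simply mirrors `grep neuron`; it is an assumption, not a documented module name.

```python
import subprocess

def module_loaded(lsmod_output, name="neuron"):
    """Return True if any module name in `lsmod` output contains `name`.

    Mirrors `lsmod | grep neuron`; the substring match is an assumption.
    """
    for line in lsmod_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and name in fields[0]:
            return True
    return False

def run(cmd):
    """Run a command and return its stdout, or '' if the tool is missing."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        return result.stdout
    except FileNotFoundError:
        return ""

def diagnose():
    """Collect the driver and runtime-service state in one dict."""
    service = run(["systemctl", "is-active", "neuron-rtd"]).strip()
    return {
        "driver_loaded": module_loaded(run(["lsmod"])),
        "neuron_rtd": service or "unknown",
    }
```

Calling `diagnose()` on the instance returns a small dict you can paste into the thread. If `driver_loaded` is false, reinstalling aws-neuron-dkms as suggested above is the first thing to try.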
I wasn't able to fix the issue on the existing AMI, so I launched a fresh instance with Ubuntu DLAMI version 48. The sentiment analysis example works fine there.
However, I'm getting an error when replicating this for a zero-shot classification task. I have updated to the latest version of transformers.
from transformers import pipeline
import tensorflow as tf
import tensorflow.neuron as tfn
model_name = 'facebook/bart-large-mnli'
pipe = pipeline('zero-shot-classification', model=model_name, framework='tf')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-4c9dfcd1ef90> in <module>
5 model_name = 'facebook/bart-large-mnli'
6
----> 7 pipe = pipeline('zero-shot-classification', model=model_name, framework='tf')
~/neuron_tf2_env/lib/python3.6/site-packages/transformers/pipelines/__init__.py in pipeline(task, model, config, tokenizer, feature_extractor, framework, revision, use_fast, use_auth_token, model_kwargs, **kwargs)
433 revision=revision,
434 task=task,
--> 435 **model_kwargs,
436 )
437
~/neuron_tf2_env/lib/python3.6/site-packages/transformers/pipelines/base.py in infer_framework_load_model(model, config, model_classes, task, framework, **model_kwargs)
141
142 if isinstance(model, str):
--> 143 raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
144
145 framework = "tf" if model.__class__.__name__.startswith("TF") else "pt"
ValueError: Could not load model facebook/bart-large-mnli with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForSequenceClassification'>,).
Any suggestions?
Our runtime team experts are looking into the issue. In the meantime, could you please post your Inf1 instance ID, stop and start the instance, then rerun? If the issue persists, please reach out to us directly by e-mail at aws-neuron-support@amazon.com
We have raised a support case with our TAM and will update the issue here once we hear from them.
Thanks, Subham
This topic has been picked up via the customer's account manager. For other customers, I wanted to share some sample code (prepared by another member of the team) which may help:
from transformers import pipeline
import tensorflow as tf
import tensorflow.neuron as tfn
import time
#model_name = 'facebook/bart-large-mnli'
# 'typeform/distilbert-base-uncased-mnli' is unsupported
# (see SUPPORTED_TASKS in https://huggingface.co/transformers/_modules/transformers/pipelines.html)
#model_name = 'typeform/distilbert-base-uncased-mnli'
# Choosing supported model for 'zero-shot-classification' task
model_name = 'roberta-large-mnli'
pipe = pipeline('zero-shot-classification', model=model_name, framework='tf')
sequence_to_classify = "one day I will see the world"
# 52 labels (classes)
# 1 million sequences
# 128 seqlen (varies 5 to 128)
# stats for typeform/distilbert-base-uncased-mnli
# g4dn.2xlarge: 5 minutes 24 sec for 10k sequences
# p2.xlarge: 11 minutes 42 sec for 10k sequences
candidate_labels = ['travel', 'cooking', 'dancing']
start = time.time()
print(pipe(sequence_to_classify, candidate_labels))
print("CPU infer time: ", time.time() - start)
# On g4 machine, DLAMI v48, pytorch_p36
#(pytorch_p36) ubuntu@ip-172-31-6-163:~/github$ python test.py
#{'sequence': 'one day I will see the world', 'labels': ['travel', 'dancing', 'cooking'], 'scores': [0.9938651323318481, 0.003273785812780261, 0.002861040411517024]}
neuron_pipe = pipeline('zero-shot-classification', model='roberta-large-mnli', framework='tf')
#the first step is to modify the underlying tokenizer to create a static
#input shape as inferentia does not work with dynamic input shapes
original_tokenizer = pipe.tokenizer
#we intercept the function call to the original tokenizer
#and inject our own code to modify the arguments
def wrapper_function(*args, **kwargs):
    kwargs['padding'] = 'max_length'
    #this is the key line here to set a static input shape
    #so that all inputs are set to a len of 128
    kwargs['max_length'] = 128
    kwargs['truncation'] = True
    kwargs['return_tensors'] = 'tf'
    return original_tokenizer(*args, **kwargs)
#insert our wrapper function as the new tokenizer as well
#as reinserting back some attribute information that was lost
#when we replaced the original tokenizer with our wrapper function
neuron_pipe.tokenizer = wrapper_function
neuron_pipe.tokenizer.decode = original_tokenizer.decode
neuron_pipe.tokenizer.mask_token_id = original_tokenizer.mask_token_id
neuron_pipe.tokenizer.pad_token_id = original_tokenizer.pad_token_id
neuron_pipe.tokenizer.convert_ids_to_tokens = original_tokenizer.convert_ids_to_tokens
#Now that our neuron_classifier is ready we can use it to
#generate an example input which is needed to compile the model
#note that pipe.model is the actual underlying model itself which
#is what Tensorflow Neuron actually compiles.
example_inputs = neuron_pipe.tokenizer('we can use any string here to generate example inputs')
#compile the model by calling tfn.trace by passing in the underlying model
#and the example inputs generated by our updated tokenizer
start = time.time()
neuron_model = tfn.trace(pipe.model, example_inputs)
print("Neuron compile time: ", time.time() - start)
#saved_model_dir = './neuron-' + model_name
#neuron_model.save(saved_model_dir)
#tf.keras.models.load_model(saved_model_dir)
#now we can insert the neuron_model and replace the cpu model
#so now we have a huggingface pipeline that uses an underlying neuron model!
neuron_pipe.model = neuron_model
neuron_pipe.model.config = pipe.model.config
start = time.time()
print(neuron_pipe(sequence_to_classify, candidate_labels))
print("Neuron infer time: ", time.time() - start)
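The tokenizer wrapper is the part of the sample that tends to confuse people, so here is a small sketch of the same idea that runs without transformers or Neuron installed. `toy_tokenizer` is a hypothetical stand-in for a real tokenizer and `force_static_shape` is an illustrative helper name; neither comes from the sample. The point is only to show that forcing `padding`, `max_length` and `truncation` on every call yields the same output length for any input.

```python
def force_static_shape(tokenizer, max_length=128):
    """Wrap a tokenizer callable so every call pads/truncates to max_length."""
    def wrapper(*args, **kwargs):
        # Override whatever the caller passed so the shape never varies.
        kwargs["padding"] = "max_length"
        kwargs["max_length"] = max_length
        kwargs["truncation"] = True
        return tokenizer(*args, **kwargs)
    return wrapper

def toy_tokenizer(text, padding=False, max_length=None, truncation=False):
    """Hypothetical stand-in tokenizer: one integer id per whitespace token."""
    ids = [len(word) for word in text.split()]
    if truncation and max_length is not None:
        ids = ids[:max_length]
    mask = [1] * len(ids)
    if padding == "max_length" and max_length is not None:
        pad = max_length - len(ids)
        ids = ids + [0] * pad
        mask = mask + [0] * pad
    return {"input_ids": ids, "attention_mask": mask}

static_tokenizer = force_static_shape(toy_tokenizer, max_length=8)
short = static_tokenizer("one day I will see the world")  # padded up to 8
long = static_tokenizer("word " * 20)                     # truncated down to 8
print(len(short["input_ids"]), len(long["input_ids"]))
```

Both calls produce sequences of length 8, and the attention mask marks which positions are real tokens. The sample above does the same thing with a length of 128 so that the compiled model always sees one fixed input shape.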
Saving the model to local disk with neuron_model.save(saved_model_dir) throws this error:
tensorflow.python.saved_model.nested_structure_coder.NotEncodableError
Error stack trace:
  File "<stdin>", line 1, in <module>
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 2132, in save
    save.save_model(self, filepath, overwrite, include_optimizer, save_format,
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/keras/saving/save.py", line 150, in save_model
    saved_model_save.save(model, filepath, overwrite, include_optimizer,
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/keras/saving/saved_model/save.py", line 89, in save
    saved_nodes, node_paths = save_lib.save_and_return_nodes(
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1268, in save_and_return_nodes
    _build_meta_graph(obj, signatures, options, meta_graph_def))
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1441, in _build_meta_graph
    return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1405, in _build_meta_graph_impl
    object_graph_proto = _serialize_object_graph(saveable_view,
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 967, in _serialize_object_graph
    serialized = function_serialization.serialize_concrete_function(
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/function_serialization.py", line 73, in serialize_concrete_function
    nested_structure_coder.encode_structure(
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/nested_structure_coder.py", line 103, in encode_structure
    return _map_structure(nested_structure, _get_encoders())
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/nested_structure_coder.py", line 85, in _map_structure
    return do(pyobj, recursion_fn)
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/nested_structure_coder.py", line 188, in do_encode
    encoded_tuple.tuple_value.values.add().CopyFrom(encode_fn(element))
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/nested_structure_coder.py", line 85, in _map_structure
    return do(pyobj, recursion_fn)
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/nested_structure_coder.py", line 188, in do_encode
    encoded_tuple.tuple_value.values.add().CopyFrom(encode_fn(element))
  File "/home/ec2-user/anaconda3/envs/aws_neuron_tensorflow2_p38/lib/python3.8/site-packages/tensorflow/python/saved_model/nested_structure_coder.py", line 86, in _map_structure
    raise NotEncodableError(
tensorflow.python.saved_model.nested_structure_coder.NotEncodableError: No encoder for object {'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='attention_mask')} of type <class 'transformers.tokenization_utils_base.BatchEncoding'>.
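The last line of the trace says the SavedModel encoder has no handler for the `BatchEncoding` object that the tokenizer returned and that was passed to `tfn.trace` as the example input. One thing that might be worth trying, though it is not confirmed anywhere in this thread, is converting the tokenizer output to a plain `dict` before tracing. The sketch below only demonstrates the conversion itself. `FakeBatchEncoding` is a hypothetical stand-in built on `UserDict` (which is what `BatchEncoding` subclasses) so the snippet runs without transformers, and `to_plain_dict` is an illustrative helper name.

```python
from collections import UserDict

class FakeBatchEncoding(UserDict):
    """Hypothetical stand-in for transformers' BatchEncoding."""

def to_plain_dict(encoding):
    """Copy a dict-like tokenizer output into a built-in dict."""
    return {key: encoding[key] for key in encoding.keys()}

enc = FakeBatchEncoding({
    "input_ids": [[0, 1, 2]],
    "attention_mask": [[1, 1, 1]],
})
plain = to_plain_dict(enc)
print(type(plain).__name__, sorted(plain))
```

In the sample above, the equivalent untested change would be to pass `to_plain_dict(example_inputs)` to `tfn.trace` instead of `example_inputs`, and then retry `neuron_model.save(saved_model_dir)`.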
@aws-taylor, could you please help us resolve this?
Thank you, VS
Hi
I want to run a zero-shot classification task. I am using the HuggingFace transformers pipeline for this task.
How do I use this pipeline to accelerate inference with torch_neuron?
I have read the docs, but I am not sure how I would compile and load the model when directly referencing a transformers pipeline.
Kindly help.
Thanks, Subham