Hi @Vinayaks117, the error message in your log shows "Compile command returned: -9". This message typically indicates that the compiler process was killed. Normally this is due to the OOM (out of memory) killer (run by the Linux operating system) killing the compilation process because of memory exhaustion. The most recent version of torch-neuron should provide an updated message for -9 errors that reflects the typical cause of this failure mode.
We recommend you try compiling on an instance with more memory, such as an inf1.6xlarge. Note: you only need the larger instance for compilation; you can still use a smaller instance (such as an inf1.xlarge) to run inference.
Please let us know if compiling on a larger instance resolved the error you’re seeing.
Thanks for the advice @hannanjgaws, it works.
However, I observed a lot of misclassifications from the Neuron model compared to the fine-tuned BERT model, so we can't productionize the Neuron model.
Any idea why there is a difference in performance? I believe there might be an issue in converting the fine-tuned BERT model to an AWS Neuron model.
Please check if you can help with this issue.
Conversion logs for your reference:
INFO:Neuron:There are 3 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-pytorch.md)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 565, fused = 548, percent fused = 96.99%
INFO:Neuron:Number of neuron graph operations 1601 did not match traced graph 1323 - using heuristic matching of hierarchical information
WARNING:tensorflow:From /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/ops/aten.py:2022: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:Neuron:Compiling function _NeuronGraph$698 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/python3/bin/neuron-cc compile /tmp/tmp3o4t_86z/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp3o4t_86z/graph_def.neff --io-config {"inputs": {"0:0": [[1, 128, 768], "float32"], "1:0": [[1, 1, 1, 128], "float32"]}, "outputs": ["Linear_5/aten_linear/Add:0"]} --verbose 35'
INFO:Neuron:skip_inference_context for tensorboard symbols at /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/tensorboard.py:305 tb_parse
INFO:Neuron:Number of neuron graph operations 1601 did not match traced graph 1323 - using heuristic matching of hierarchical information
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 565, compiled = 548, percent compiled = 96.99%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 1 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 100.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 96
INFO:Neuron: => aten::add: 36
INFO:Neuron: => aten::contiguous: 12
INFO:Neuron: => aten::div: 12
INFO:Neuron: => aten::dropout: 38
INFO:Neuron: => aten::gelu: 12
INFO:Neuron: => aten::layer_norm: 25
INFO:Neuron: => aten::linear: 74
INFO:Neuron: => aten::matmul: 24
INFO:Neuron: => aten::permute: 48
INFO:Neuron: => aten::select: 1
INFO:Neuron: => aten::size: 96
INFO:Neuron: => aten::slice: 1
INFO:Neuron: => aten::softmax: 12
INFO:Neuron: => aten::tanh: 1
INFO:Neuron: => aten::transpose: 12
INFO:Neuron: => aten::view: 48
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 1 [supported]
INFO:Neuron: => aten::add: 3 [supported]
INFO:Neuron: => aten::embedding: 3 [not supported]
INFO:Neuron: => aten::mul: 1 [supported]
INFO:Neuron: => aten::rsub: 1 [supported]
INFO:Neuron: => aten::size: 1 [supported]
INFO:Neuron: => aten::slice: 4 [supported]
INFO:Neuron: => aten::to: 1 [supported]
INFO:Neuron: => aten::unsqueeze: 2 [supported]
INFO:Neuron:skip_inference_context for tensorboard symbols at /home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/torch_neuron/tensorboard.py:305 tb_parse
INFO:Neuron:Number of neuron graph operations 61 did not match traced graph 105 - using heuristic matching of hierarchical information
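(As a side note, a minimal sketch of one way to quantify the disagreement between the fine-tuned model and the compiled Neuron model; the paths, sample texts, and single-label argmax comparison below are illustrative assumptions, not part of the original report. For a multi-label head you would threshold sigmoid outputs instead of taking argmax.)

import torch
import torch.neuron  # registers the Neuron runtime ops so the traced model can execute
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical paths and sample texts -- replace with your own artifacts and data.
model_path = "model/"                  # directory with the fine-tuned BERT model
neuron_model_path = "neuron_model.pt"  # the compiled Neuron model
sample_texts = ["first example sentence", "second example sentence"]

tokenizer = AutoTokenizer.from_pretrained(model_path)
cpu_model = AutoModelForSequenceClassification.from_pretrained(model_path, torchscript=True)
cpu_model.eval()
neuron_model = torch.jit.load(neuron_model_path)

matches = 0
for text in sample_texts:
    enc = tokenizer(text, max_length=128, padding="max_length",
                    truncation=True, return_tensors="pt")
    with torch.no_grad():
        cpu_logits = cpu_model(**enc)[0]
    # The Neuron model expects the same positional inputs it was traced with.
    neuron_logits = neuron_model(*tuple(enc.values()))[0]
    matches += int(cpu_logits.argmax(-1).item() == neuron_logits.argmax(-1).item())

print(f"Prediction agreement: {matches}/{len(sample_texts)}")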
Hi @Vinayaks117, to maximize numerical accuracy you can try using the "--fast-math none" compiler flag. If you find that this achieves your accuracy goals, you can tune the compilation options according to the documentation here: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/appnotes/perf/mixed-precision.html.
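(A minimal sketch of how that flag can be passed through torch.neuron.trace's compiler_args, mirroring the compilation flow discussed in this thread; the model path and dummy input are placeholder assumptions.)

import torch
import torch.neuron
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_path = "model/"  # placeholder: directory containing the fine-tuned model
max_length = 128

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path, torchscript=True)

dummy = tokenizer("dummy input which will be padded later", max_length=max_length,
                  padding="max_length", truncation=True, return_tensors="pt")

# Disable fast-math approximations so the compiled graph stays numerically closer
# to the original fp32 model, at some cost in throughput.
model_neuron = torch.neuron.trace(model, tuple(dummy.values()),
                                  compiler_args=["--fast-math", "none"])
model_neuron.save("neuron_model.pt")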
If using these compiler flags doesn’t help, would it be possible for you to share your model with us, so that we can recreate the issue and debug it with you? (Feel free to share directly to aws-neuron-support@amazon.com, if that’s easier than posting here).
If sharing your model is not a possibility, can you point us to an open source model with a similar architecture to your model?
Hello @hannanjgaws
We tried using the "--fast-math none" compiler flag, but there are still a lot of misclassification errors.
As requested, I have shared the model artifacts via email. Please have a look. Thanks.
Thank you for sending your model artifacts. We will take a look at reproducing the accuracy issues and will provide updates on this ticket.
Hi @hannanjgaws
Any updates please? Thanks
Hello @Vinayaks117,
It appears that you are using conda but installing packages via pip. This is known to cause version mismatch issues in some cases. I was able to successfully compile the model you sent us using the 'conda_mxnet_p37' kernel (ignore the conda in the name; it is unused) in a SageMaker notebook. Below is the associated .ipynb file:
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "ae2702c8",
"metadata": {},
"outputs": [],
"source": [
"import sys"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fca94bab",
"metadata": {},
"outputs": [],
"source": [
"!{sys.executable} -m pip install \"torch-neuron==1.8.1.*\" \"neuron-cc[tensorflow]\" \"protobuf<4\" torchvision \"sagemaker>=2.79.0\" \"transformers==4.17.0\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b2be64b",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import tensorflow # to workaround a protobuf version conflict issue\n",
"import torch\n",
"import torch.neuron\n",
"from transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
"\n",
"model_path = 'model/' # Model artifacts are stored in 'model/' directory\n",
"\n",
"# load tokenizer and model\n",
"tokenizer = AutoTokenizer.from_pretrained(model_path)\n",
"model = AutoModelForSequenceClassification.from_pretrained(model_path, torchscript=True)\n",
"\n",
"# create dummy input for max length 128\n",
"dummy_input = \"dummy input which will be padded later\"\n",
"max_length = 128\n",
"embeddings = tokenizer(dummy_input, max_length=max_length, padding=\"max_length\", truncation=True, return_tensors=\"pt\")\n",
"neuron_inputs = tuple(embeddings.values())\n",
"\n",
"# compile model with torch.neuron.trace and update config\n",
"model_neuron = torch.neuron.trace(model, neuron_inputs, compiler_workdir='.')\n",
"model.config.update({\"traced_sequence_length\": max_length})\n",
"\n",
"# save tokenizer, neuron model and config for later use\n",
"save_dir=\"tmpd\"\n",
"os.makedirs(\"tmpd\",exist_ok=True)\n",
"model_neuron.save(os.path.join(save_dir,\"neuron_model.pt\"))\n",
"tokenizer.save_pretrained(save_dir)\n",
"model.config.save_pretrained(save_dir)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e1645c70",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_mxnet_p37",
"language": "python",
"name": "conda_mxnet_p37"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
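(For completeness, a minimal sketch of loading the artifacts saved by the notebook above and running a single prediction; the example text is a placeholder.)

import torch
import torch.neuron  # needed so the saved Neuron graph can be deserialized and run
from transformers import AutoTokenizer

save_dir = "tmpd"  # directory written by the notebook above
tokenizer = AutoTokenizer.from_pretrained(save_dir)
neuron_model = torch.jit.load(f"{save_dir}/neuron_model.pt")

# Pad/truncate to the traced sequence length (128) so the input shapes match the trace.
enc = tokenizer("example text to classify", max_length=128,
                padding="max_length", truncation=True, return_tensors="pt")
logits = neuron_model(*tuple(enc.values()))[0]
print(logits.argmax(-1))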
Hi Team,
I have a fine-tuned BERT model which was trained using the following libraries: torch == 1.8.1+cu111, transformers == 4.19.4.
I am not able to convert that fine-tuned BERT model into an AWS Neuron model and am getting the following compilation errors. Could you please help me resolve this issue?
Note: I am trying to compile the BERT model on a SageMaker notebook instance with the "conda_python3" conda environment.
Installation:
Set Pip repository to point to the Neuron repository
!pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com
Install Neuron PyTorch - Note: Tried both options below.
"#!pip install torch-neuron==1.8.1.* neuron-cc[tensorflow] "protobuf<4" torchvision sagemaker>=2.79.0 transformers==4.17.0 --upgrade" !pip install --upgrade torch-neuron neuron-cc[tensorflow] "protobuf<4" torchvision
Model compilation:
Model artifacts: We got these model artifacts from a multi-label topic classification model.
Error logs:
Thanks a lot.