aws-neuron / aws-neuron-sdk

Failed to fuse subgraph #82

Closed: thias42 closed this issue 4 years ago

thias42 commented 4 years ago

Hi, I am trying to convert a pre-trained TensorFlow model to run on AWS Inferentia. The Neuron compiler converts the model successfully, but it gives warnings about failing to fuse subgraphs. The number of operations placed on the Neuron runtime is 0, so inference on an Inf1 instance is much slower than it should be. Do you have any tips on how to solve this?

Running this compile script on an Ubuntu DLAMI (version 26) in the aws_neuron_tensorflow_p36 environment:

import os
import time
import shutil
import tensorflow as tf
import tensorflow.neuron as tfn
import tensorflow.compat.v1.keras as keras
import openl3

WORKSPACE = './ws_openl3'
os.makedirs(WORKSPACE, exist_ok=True)
model_dir = os.path.join(WORKSPACE, 'openl3')
compiled_model_dir = os.path.join(WORKSPACE, 'openl3_neuron')
shutil.rmtree(model_dir, ignore_errors=True)
shutil.rmtree(compiled_model_dir, ignore_errors=True)

keras.backend.set_learning_phase(0)
model = openl3.models.load_audio_embedding_model(input_repr="mel256", content_type="music", embedding_size=512)

tf.saved_model.simple_save(
    session = keras.backend.get_session(),
    export_dir = model_dir,
    inputs = {'input': model.inputs[0]},
    outputs = {'output': model.outputs[0]})

tfn.saved_model.compile(model_dir, compiled_model_dir)
shutil.make_archive('./openl3_neuron', 'zip', WORKSPACE, 'openl3_neuron')

produces the following output:

INFO:tensorflow:Restoring parameters from ./ws_openl3/openl3/variables/variables
INFO:tensorflow:Froze 51 variables.
INFO:tensorflow:Converted 51 variables to const ops.
2020-02-20 17:46:35.530084: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 4 ops of 3 different types in the graph that are not compiled by neuron-cc: Unpack, NoOp, Placeholder, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-tensorflow.md).
INFO:tensorflow:fusing subgraph neuron_op_5c0465282a95dec5 with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_5c0465282a95dec5 with '/home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/bin/neuron-cc compile /tmp/tmpiq09lm81/neuron_op_5c0465282a95dec5/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpiq09lm81/neuron_op_5c0465282a95dec5/graph_def.neff --io-config "{\"inputs\": {\"melspectrogram_1/transpose_20/_0:0\": [[1, 1, 199, 1025], \"float32\"], \"melspectrogram_1/unstack0/_1:0\": [[], \"int32\"]}, \"outputs\": [\"flatten_1/Reshape:0\"]}"'
INFO:tensorflow:fusing subgraph neuron_op_879ef434f1d5fcf0 with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_879ef434f1d5fcf0 with '/home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/bin/neuron-cc compile /tmp/tmpiq09lm81/neuron_op_879ef434f1d5fcf0/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpiq09lm81/neuron_op_879ef434f1d5fcf0/graph_def.neff --io-config "{\"inputs\": {\"input_10/_2:0\": [[1, 1, 48000], \"float32\"]}, \"outputs\": [\"melspectrogram_1/transpose_2:0\", \"melspectrogram_1/Shape:0\"]}"'
INFO:tensorflow:Number of operations in TensorFlow session: 1193
INFO:tensorflow:Number of operations after tf.neuron optimizations: 151
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: ./ws_openl3/openl3_neuron/saved_model.pb
INFO:tensorflow:Successfully converted ./ws_openl3/openl3 to ./ws_openl3/openl3_neuron
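
As a rough aid for reading output like the above, the following sketch pulls the failed subgraph names and the three operation counts out of a saved compile log. It is not part of the Neuron SDK: the function names and the regular expressions are assumptions based only on the log lines shown in this thread, and other SDK versions may word these messages differently.

```python
import re

# Patterns derived from the log lines quoted in this thread (an assumption:
# other Neuron SDK versions may phrase these messages differently).
FAILED_RE = re.compile(r"Failed to fuse subgraph (neuron_op_[0-9a-f]+)")
COUNT_RE = re.compile(
    r"Number of operations "
    r"(in TensorFlow session|after tf\.neuron optimizations|placed on Neuron runtime)"
    r": (\d+)"
)

# Excerpt of the log above; the long neuron-cc command lines are shortened.
SAMPLE_LOG = """\
INFO:tensorflow:fusing subgraph neuron_op_5c0465282a95dec5 with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_5c0465282a95dec5 with '...'
INFO:tensorflow:fusing subgraph neuron_op_879ef434f1d5fcf0 with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_879ef434f1d5fcf0 with '...'
INFO:tensorflow:Number of operations in TensorFlow session: 1193
INFO:tensorflow:Number of operations after tf.neuron optimizations: 151
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
"""


def summarize_compile_log(log_text):
    """Collect failed subgraph names and operation counts from a compile log."""
    failed = FAILED_RE.findall(log_text)
    counts = {label: int(value) for label, value in COUNT_RE.findall(log_text)}
    return {"failed_subgraphs": failed, "counts": counts}


def neuron_fraction(summary):
    """Fraction of inference-optimized operations placed on the Neuron runtime."""
    counts = summary["counts"]
    optimized = counts.get("after tf.neuron optimizations")
    placed = counts.get("placed on Neuron runtime")
    if not optimized or placed is None:
        return None  # counts missing from the log, or zero optimized operations
    return placed / optimized


if __name__ == "__main__":
    summary = summarize_compile_log(SAMPLE_LOG)
    print(summary["failed_subgraphs"])
    print(neuron_fraction(summary))
```

A fraction of zero together with a non-empty list of failed subgraphs matches the situation described here: the SavedModel is written, but every operation stays on the CPU.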

micwade-aws commented 4 years ago

Hi, thias42.

We’re going to look into this soon. Please double-check that you’ve updated the entire Neuron SDK and are running the latest versions of all components. Instructions for updating Neuron are available here: https://github.com/aws/aws-neuron-sdk/blob/master/docs/neuron-install-guide.md

thias42 commented 4 years ago

Yes, the Neuron SDK and tools are up-to-date.

jeffhataws commented 4 years ago

Thanks thias42. We have reproduced your issue and are investigating it.

jeffhataws commented 4 years ago

Currently, the compiler is unable to compile certain configurations due to tensor size limitations. We are aware of these limitations and are working to remove them. For more information, please see the release notes at https://github.com/aws/aws-neuron-sdk/tree/master/release-notes. For now, if you change the input representation (input_repr) to "mel128" and use the compile option no_fuse_ops=["melspectrogram/mul"], 99 of the 151 inference-optimized operations will be placed on the Neuron runtime.

model = openl3.models.load_audio_embedding_model(input_repr="mel128", content_type="music", embedding_size=512)
tf.saved_model.simple_save(
    session = keras.backend.get_session(),
    export_dir = model_dir,
    inputs = {'input': model.inputs[0]},
    outputs = {'output': model.outputs[0]})
tfn.saved_model.compile(model_dir, compiled_model_dir, no_fuse_ops=["melspectrogram/mul"])

You will see that the framework partitions the graph into three subgraphs, but only one of them is compiled successfully (the portion after melspectrogram/mul):

INFO:tensorflow:fusing subgraph neuron_op_d4374713206b859f with neuron-cc
INFO:tensorflow:fusing subgraph neuron_op_2fdb7316ce04901d with neuron-cc
INFO:tensorflow:fusing subgraph neuron_op_bc8fe16a32d9e22a with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_2fdb7316ce04901d with '/home/ubuntu/test_venv/bin/neuron-cc compile /tmp/tmp63uejh3m/neuron_op_2fdb7316ce04901d/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp63uejh3m/neuron_op_2fdb7316ce04901d/graph_def.neff --io-config "{\"inputs\": {\"melspectrogram/transpose_20/_1:0\": [[1, 1, 199, 1025], \"float32\"], \"melspectrogram/unstack0/_2:0\": [[], \"int32\"]}, \"outputs\": [\"melspectrogram/Log:0\"]}"'
WARNING:tensorflow:Failed to fuse subgraph neuron_op_bc8fe16a32d9e22a with '/home/ubuntu/test_venv/bin/neuron-cc compile /tmp/tmp63uejh3m/neuron_op_bc8fe16a32d9e22a/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp63uejh3m/neuron_op_bc8fe16a32d9e22a/graph_def.neff --io-config "{\"inputs\": {\"input_10/_3:0\": [[1, 1, 48000], \"float32\"]}, \"outputs\": [\"melspectrogram/transpose_2:0\", \"melspectrogram/Shape:0\"]}"'
INFO:tensorflow:Number of operations in TensorFlow session: 790
INFO:tensorflow:Number of operations after tf.neuron optimizations: 151
INFO:tensorflow:Number of operations placed on Neuron runtime: 99
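
Since the failures are attributed to tensor sizes, it can help to see which tensors feed a failed subgraph. The sketch below reads the --io-config argument out of a "Failed to fuse subgraph" warning and lists each input's shape, dtype, and element count. The helper names are hypothetical, the parsing assumes the escaping seen in the warnings above, and the thread does not say what the actual size limit is, so this only surfaces the numbers for comparison against the release notes.

```python
import json
import re

# Greedy match up to the last double quote on the line, which in the warnings
# above is the quote that closes the --io-config argument.
IO_CONFIG_RE = re.compile(r'--io-config "(.*)"')

# Tail of one warning line from the log above, copied as-is.
SAMPLE_WARNING = r"""--io-config "{\"inputs\": {\"melspectrogram/transpose_20/_1:0\": [[1, 1, 199, 1025], \"float32\"], \"melspectrogram/unstack0/_2:0\": [[], \"int32\"]}, \"outputs\": [\"melspectrogram/Log:0\"]}"'"""


def parse_io_config(warning_line):
    """Return the io-config of a failed subgraph as a dict, or None if absent."""
    match = IO_CONFIG_RE.search(warning_line)
    if match is None:
        return None
    # The log shows the JSON with backslash-escaped quotes; undo that escaping.
    return json.loads(match.group(1).replace('\\"', '"'))


def input_sizes(io_config):
    """Map each input tensor name to (shape, dtype, number of elements)."""
    sizes = {}
    for name, (shape, dtype) in io_config["inputs"].items():
        count = 1
        for dim in shape:
            count *= dim  # an empty shape (scalar) leaves count at 1
        sizes[name] = (shape, dtype, count)
    return sizes


if __name__ == "__main__":
    config = parse_io_config(SAMPLE_WARNING)
    for name, (shape, dtype, count) in input_sizes(config).items():
        print(name, shape, dtype, count)
```

Running the same helper on the warnings from a mel256 compile and a mel128 compile gives a side-by-side view of the subgraph inputs in each case.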

Please let us know if you need further assistance. You can also file an AWS support ticket or contact us directly at aws-neuron-support@amazon.com.

thias42 commented 4 years ago

Thank you! I'll stick to mel128 for now. Looking forward to a version without the tensor size limitations.