Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents

Need Help Converting .pb / .h5 Tensorflow Model to .nn #2483

Closed weizhuang-93 closed 5 years ago

weizhuang-93 commented 5 years ago

Hi, I would like to use a TensorFlow-trained model for audio source separation in Unity. I am trying to convert the model into an NN model with Barracuda, but I keep getting errors. I tried ONNX to Barracuda, Keras to Barracuda, and TensorFlow to Barracuda, but none of them works. I named my single input and single output.

I really hope someone can help me with this!

I used: tensorflow-gpu 1.7.1, Windows 10, ml-agents 0.9.2, Python 3.6.9, protobuf 3.9.1

I convert the .h5 file to protobuf with the following code:

import tensorflow as tf  # needed for tf.global_variables() below

def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    from tensorflow.python.framework.graph_util import convert_variables_to_constants
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        # Graph -> GraphDef ProtoBuf
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                      output_names, freeze_var_names)
        return frozen_graph

import numpy as np
import sys
import os
import tensorflow as tf
from freeze_graph import freeze_session
from tensorflow.python.keras import backend as K
from tensorflow.python.keras.models import load_model
from tensorflow.python.platform import gfile
from tensorflow.python.framework import graph_io
def main(_):

    modellpath= "C:/Users/dh70wie/source/repos/SpeechSeparation01/SpeechSeparation01/output/CNN_TF_1-7-1_BNAxis-1/saved-model-80-0.9950.hdf5"
    modellpath= 'C:\\Users\\dh70wie\\source\\repos\\SpeechSeparation01\\SpeechSeparation01\\output\\CNN_TF_1-7-1_BNAxis-1\\saved-model-80-0.9950.hdf5'
    model = tf.keras.models.load_model(modellpath)
    model.compile(optimizer=tf.keras.optimizers.Adam(lr = 0.001),
                  loss= tf.keras.losses.mean_absolute_error,
                  metrics=['mae' , 'mse'])
    print("Model done Compiling.")
    model.summary()
    model.save('.\\output\\CNN_TF_1-7-1_BNAxis-1.h5')
    model = None

    K.set_learning_phase(0)
    model = load_model('.\\output\\CNN_TF_1-7-1_BNAxis-1.h5')
    print("inputs: ")
    print(model.inputs) 
    print("outputs: ")
    print(model.outputs)
    frozen_graph = freeze_session(K.get_session(),output_names=[out.op.name for out in model.outputs])
    graph_io.write_graph(frozen_graph, "model", "CNN_TF_1.7.1_BNAxis-1.pb", as_text=False)

I trained the model with the following code:

model = tf.keras.Sequential()
model.add(layers.Conv2D(16,(3,3),padding = "valid", kernel_initializer ='normal', input_shape=(40, 65,1), name = "input",
                        kernel_regularizer =tf.keras.regularizers.l2(0.01)))
model.add(layers.BatchNormalization(axis=1))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(32,(3,3),padding = "valid", kernel_initializer ='normal',
                        kernel_regularizer =tf.keras.regularizers.l2(0.01)))
model.add(layers.BatchNormalization(axis=1))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64,(3,3),padding = "valid", kernel_initializer ='normal',
                        kernel_regularizer =tf.keras.regularizers.l2(0.01)))
model.add(layers.BatchNormalization(axis=1))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D((1, 2)))
model.add(layers.Conv2D(128,(3,3),padding = "valid", kernel_initializer ='normal',
                        kernel_regularizer =tf.keras.regularizers.l2(0.01)))
model.add(layers.BatchNormalization(axis=1))
model.add(layers.Activation('relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation = None))        
model.add(layers.BatchNormalization(axis=1))
model.add(layers.Activation('relu'))
model.add(layers.Dense(2600, activation='sigmoid'))
model.add(layers.Reshape((40,65,1),name = "output"))

model.compile(optimizer=tf.keras.optimizers.Adam(lr = 0.001),
            loss= tf.keras.losses.mean_absolute_error, metrics=['mae' , 'mse'])
model.summary()

# Train the model
model.fit(P_mix_train_abs, IRM_train, epochs = 100, batch_size = 1024, callbacks = Callbacks,
                validation_data = (P_mix_val_abs, IRM_val))

Is there a problem with my code? I am not sure why I can't convert the model.

My model: output.zip

mantasp commented 5 years ago

@weizhuang-93 is there any reason why you are doing batch normalisation on axis=1? Typically it should be the feature axis (-1), and we support only this case.
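
To make the axis remark concrete, here is a minimal sketch in plain Python (no TensorFlow needed). It assumes the channels-last layout implied by `input_shape=(40, 65, 1)` in the model above and that the batch dimension counts as axis 0, as it does in Keras. The helper names `conv2d_valid_shape` and `batchnorm_param_count` are hypothetical and exist only for this illustration.

```python
def conv2d_valid_shape(shape, filters, kernel=(3, 3)):
    """Output shape of a stride-1 'valid' Conv2D on a channels-last input."""
    batch, height, width, _ = shape
    return (batch, height - kernel[0] + 1, width - kernel[1] + 1, filters)


def batchnorm_param_count(shape, axis):
    """Length of each batch-norm vector (gamma, beta, mean, variance) for an axis."""
    return shape[axis]


# First layer of the model above: Conv2D(16, (3, 3), padding="valid") on (40, 65, 1).
input_shape = (None, 40, 65, 1)  # None is the batch dimension
after_conv = conv2d_valid_shape(input_shape, filters=16)

print(after_conv)                                   # (None, 38, 63, 16)
print(batchnorm_param_count(after_conv, axis=1))    # 38: one statistic per image row
print(batchnorm_param_count(after_conv, axis=-1))   # 16: one statistic per feature map
```

With axis=1 the layer normalises over the height dimension rather than over the 16 feature maps, which is the case described above as unsupported by the converter.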

weizhuang-93 commented 5 years ago

Sorry for the very late reply, I was busy with other things. I changed the model to axis = -1 and the problem still exists.

Below are my files: .pb, .h5, .bytes (converted from tf; instead of .pb I use .bytes) and .pbtxt: CNN_8k_2Conv2Dense_64_16.zip

weizhuang-93 commented 5 years ago

Same error using axis=-1:

(ml-agents) C:\Users\Wei Zhuang>python ./ml-agents/ml-agents/mlagents/trainers/tensorflow_to_barracuda.py D:\MasterArbeit\05_Programm_und_Code\model\CNN_8k_2Conv2Dense_64_16.pb CNN_8k_2Conv2Dense_64_16.nn
Converting D:\MasterArbeit\05_Programm_und_Code\model\CNN_8k_2Conv2Dense_64_16.pb to CNN_8k_2Conv2Dense_64_16.nn
Sorting model, may take a while... Done!
IGNORED: PlaceholderWithDefault unknown layer
IGNORED: Switch unknown layer
WARNING: rank unknown for tensor batch_normalization/cond/Switch:1 while processing node batch_normalization/cond/switch_t
Traceback (most recent call last):
  File "./ml-agents/ml-agents/mlagents/trainers/tensorflow_to_barracuda.py", line 26, in <module>
    tf2bc.convert(args.source_file, args.target_file, args.trim_unused_by_output, args)
  File "C:\Users\Wei Zhuang\ml-agents\ml-agents\mlagents\trainers\tensorflow_to_barracuda.py", line 1552, in convert
    i_model, args
  File "C:\Users\Wei Zhuang\ml-agents\ml-agents\mlagents\trainers\tensorflow_to_barracuda.py", line 1397, in process_model
    process_layer(node, o_context, args)
  File "C:\Users\Wei Zhuang\ml-agents\ml-agents\mlagents\trainers\tensorflow_to_barracuda.py", line 1150, in process_layer
    for x in tensor_names
  File "C:\Users\Wei Zhuang\ml-agents\ml-agents\mlagents\trainers\tensorflow_to_barracuda.py", line 1150, in <listcomp>
    for x in tensor_names
  File "C:\Users\Wei Zhuang\ml-agents\ml-agents\mlagents\trainers\tensorflow_to_barracuda.py", line 697, in get_tensor_data
    return np.array(data).reshape(dims)
UnboundLocalError: local variable 'data' referenced before assignment
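
For readers unfamiliar with this kind of failure: an `UnboundLocalError` at a `return` statement usually means the local variable is assigned only inside conditional branches, and none of them ran for the given input. The sketch below reproduces that pattern in isolation. The function `get_tensor_values` is a hypothetical stand-in, not the converter's actual `get_tensor_data`, and whether the converter follows this exact pattern is an assumption.

```python
def get_tensor_values(float_val, int_val):
    """Hypothetical stand-in for a converter helper; not the real get_tensor_data."""
    if float_val:
        data = float_val
    if int_val:
        data = int_val
    # If neither branch ran, 'data' was never bound and this line raises.
    return list(data)


# A tensor that carries float values is handled normally.
print(get_tensor_values([1.0, 2.0], []))

# A tensor with no values in any handled field triggers the error.
try:
    get_tensor_values([], [])
    raised = False
except UnboundLocalError:
    raised = True
print(raised)
```

Under that assumption, the traceback would be consistent with the converter meeting a tensor whose contents are stored in a field it does not read, which fits the preceding warnings about the ignored `Switch` and `PlaceholderWithDefault` nodes.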

In case it helps, I also tried the Keras to Barracuda conversion and got the following error:

Converting .\CNN_8k_2Conv2Dense_64_16\CNN_8k_2Conv2Dense_64_16.h5 to CNN_8k_2Conv2Dense_64_16.bytes
IN: '': [None, 40, 65, 1] => 'state'
OUT: 'action'
Traceback (most recent call last):
  File ".\barracuda-release-release-0.2.4\Tools\keras_to_barracuda.py", line 21, in <module>
    keras2bc.convert(args.source_file, args.target_file, args.trim_unused_by_output, args)
  File "C:\Users\Wei Zhuang\barracuda-release-release-0.2.4\Tools\keras_to_barracuda.py", line 421, in convert
    barracuda.write(o_model, target_file)
  File "C:\Users\Wei Zhuang\barracuda-release-release-0.2.4\Tools\barracuda.py", line 499, in write
    w.write_int32(l.axis)
  File "C:\Users\Wei Zhuang\barracuda-release-release-0.2.4\Tools\barracuda.py", line 447, in write_int32
    self.f.write(struct.pack('<i', d))
struct.error: required argument is not an integer
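
The last frame shows `struct.pack('<i', d)` being called with the layer's `axis` value. The `'<i'` format packs a little-endian 32-bit integer, so any non-integer value is rejected. The sketch below shows that behaviour with the standard library only. The wrapper `pack_int32` is a hypothetical name, and `None` and a list are example values only; the thread does not show what `l.axis` actually held.

```python
import struct


def pack_int32(value):
    """Pack one value as a little-endian 32-bit integer, like the call in the traceback."""
    return struct.pack('<i', value)


# A plain integer axis packs into exactly four bytes.
packed = pack_int32(-1)
print(len(packed))

# Non-integer values are rejected with struct.error.
failures = []
for bad_value in (None, [3]):
    try:
        pack_int32(bad_value)
    except struct.error:
        failures.append(bad_value)
print(failures)
```

Printing the type and value of the layer's axis just before the write would show which kind of value reached the writer.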

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had activity in the last 14 days. It will be closed in the next 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 5 years ago

This issue has been automatically closed because it has not had activity in the last 28 days. If this issue is still valid, please ping a maintainer. Thank you for your contributions.

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.