praeclarum opened this issue 3 years ago
How are you converting the model? Please share your code which uses coremltools.
Hello, thank you for getting back to me. My real example is a little long, so I wrote this minimal repro that hits the same error:
import tensorflow as tf
import tensorflow.keras as keras

x = keras.layers.Input(shape=(3, 5))
y = keras.layers.MultiHeadAttention(num_heads=7, key_dim=11)(x, x)  # self-attention
model = keras.models.Model(inputs=[x], outputs=[y])

import coremltools as ct
mlmodel = ct.convert(model)  # raises during conversion
The error with this example is:
ValueError: Cannot add const [3*is0, 5]
Thanks for the smaller example, that's very helpful. I can reproduce this issue using TensorFlow 2.5.
However, the latest version of TensorFlow that we support is 2.3.1, and it looks like keras.layers.MultiHeadAttention isn't in that version of TensorFlow.
I'll keep this issue open so we can fix it once we support a version of TensorFlow that has the layer.
Thanks.
It's unfortunate that the alternatives don't convert either. For example, when I use tensorflow_addons:
import tensorflow as tf
import tensorflow_addons as tfa
import tensorflow.keras as keras

x = keras.layers.Input(shape=(3, 5))
y = tfa.layers.MultiHeadAttention(num_heads=7, head_size=11)([x, x])  # [query, value]
model = keras.models.Model(inputs=[x], outputs=[y])
model.summary()

import coremltools as ct
mlmodel = ct.convert(model)  # fails with the einsum error below
I get this conversion error:
('Einsum unsupported equation format: ', '...NI,HIO->...NHO')
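For concreteness, here is what that equation computes, as a small numpy sketch (dimension letters per the tfa layer: N = sequence length, I = input dim, H = number of heads, O = head size; the concrete shapes are just my own choice to mirror the repro):

import numpy as np

a = np.random.rand(2, 3, 5)   # (batch, N, I); the '...' covers the batch dim
b = np.random.rand(7, 5, 11)  # (H, I, O): one projection matrix per head
out = np.einsum('...NI,HIO->...NHO', a, b)
print(out.shape)  # (2, 3, 7, 11): a per-head projection of every position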
Any chance we could get support for that equation?
Alternatively, do you know of any multi-head attention libraries that work with CoreML tools?
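In case it helps anyone hitting the same wall, the workaround I'm considering is to build the attention out of Dense layers and explicit matmuls so that no einsum op ends up in the graph. This is just a sketch (plain self-attention, no masking or dropout, untested against the converter):

import tensorflow as tf
import tensorflow.keras as keras

num_heads, key_dim = 7, 11  # mirrors the repro above
seq_len, model_dim = 3, 5

x = keras.layers.Input(shape=(seq_len, model_dim))
q = keras.layers.Dense(num_heads * key_dim)(x)
k = keras.layers.Dense(num_heads * key_dim)(x)
v = keras.layers.Dense(num_heads * key_dim)(x)

def split_heads(t):
    # (batch, seq, heads*key_dim) -> (batch, heads, seq, key_dim)
    t = tf.reshape(t, (-1, seq_len, num_heads, key_dim))
    return tf.transpose(t, (0, 2, 1, 3))

q, k, v = split_heads(q), split_heads(k), split_heads(v)
scores = tf.matmul(q, k, transpose_b=True) / (key_dim ** 0.5)  # scaled dot product
weights = tf.nn.softmax(scores, axis=-1)
context = tf.matmul(weights, v)                # (batch, heads, seq, key_dim)
context = tf.transpose(context, (0, 2, 1, 3))  # back to (batch, seq, heads, key_dim)
context = tf.reshape(context, (-1, seq_len, num_heads * key_dim))
y = keras.layers.Dense(model_dim)(context)     # output projection
model = keras.models.Model(inputs=[x], outputs=[y])

No idea yet whether the converter is happy with the reshapes over a symbolic batch dimension, though.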
Hi @TobyRoseman, any update on this issue? The error still reproduces.
Sorry @netanellavisdris - no updates.
This is related to #1537.
Since coremltools 6.3 supports flexible-shape einsum, this issue can be resolved.
The demo code still fails with 6.3, although the error is different: the generic einsum lowering now dies in a transpose whose perm length (5) doesn't match the input's rank (3).
ValueError Traceback (most recent call last)
Cell In[1], line 11
8 model.summary()
10 import coremltools as ct
---> 11 mlmodel = ct.convert(model)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/_converters_entry.py:492, in convert(model, source, inputs, outputs, classifier_config, minimum_deployment_target, convert_to, compute_precision, skip_model_load, compute_units, package_dir, debug, pass_pipeline)
489 if specification_version is None:
490 specification_version = _set_default_specification_version(exact_target)
--> 492 mlmodel = mil_convert(
493 model,
494 convert_from=exact_source,
495 convert_to=exact_target,
496 inputs=inputs,
497 outputs=outputs_as_tensor_or_image_types, # None or list[ct.ImageType/ct.TensorType]
498 classifier_config=classifier_config,
499 skip_model_load=skip_model_load,
500 compute_units=compute_units,
501 package_dir=package_dir,
502 debug=debug,
503 specification_version=specification_version,
504 main_pipeline=pass_pipeline,
505 )
507 if exact_target == 'milinternal':
508 return mlmodel # Returns the MIL program
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:188, in mil_convert(model, convert_from, convert_to, compute_units, **kwargs)
149 @_profile
150 def mil_convert(
151 model,
(...)
155 **kwargs
156 ):
157 """
158 Convert model from a specified frontend `convert_from` to a specified
159 converter backend `convert_to`.
(...)
186 See `coremltools.converters.convert`
187 """
--> 188 return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:212, in _mil_convert(model, convert_from, convert_to, registry, modelClass, compute_units, **kwargs)
209 weights_dir = _tempfile.TemporaryDirectory()
210 kwargs["weights_dir"] = weights_dir.name
--> 212 proto, mil_program = mil_convert_to_proto(
213 model,
214 convert_from,
215 convert_to,
216 registry,
217 **kwargs
218 )
220 _reset_conversion_state()
222 if convert_to == 'milinternal':
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:285, in mil_convert_to_proto(model, convert_from, convert_to, converter_registry, main_pipeline, **kwargs)
280 frontend_pipeline, backend_pipeline = _construct_other_pipelines(
281 main_pipeline, convert_from, convert_to
282 )
284 frontend_converter = frontend_converter_type()
--> 285 prog = frontend_converter(model, **kwargs)
286 PipelineManager.apply_pipeline(prog, frontend_pipeline)
288 PipelineManager.apply_pipeline(prog, main_pipeline)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/converter.py:98, in TensorFlow2Frontend.__call__(self, *args, **kwargs)
95 from .frontend.tensorflow2.load import TF2Loader
97 tf2_loader = TF2Loader(*args, **kwargs)
---> 98 return tf2_loader.load()
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/tensorflow/load.py:82, in TFLoader.load(self)
75 dot_string = self._tf_ssa.get_dot_string(
76 annotation=True, name_and_op_style=True, highlight_debug_nodes=[]
77 )
78 graphviz.Source(dot_string).view(
79 filename="/tmp/ssa_before_tf_passes", cleanup=True
80 )
---> 82 program = self._program_from_tf_ssa()
83 logger.debug("program:\n{}".format(program))
84 return program
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/tensorflow2/load.py:210, in TF2Loader._program_from_tf_ssa(self)
203 self._run_tf_ssa_passes()
204 converter = TF2Converter(
205 tfssa=self._tf_ssa,
206 inputs=self.kwargs["inputs"],
207 outputs=self.kwargs["outputs"],
208 opset_version=self.kwargs["specification_version"],
209 )
--> 210 return converter.convert()
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/tensorflow/converter.py:465, in TFConverter.convert(self)
463 for g_name in self.graph_stack[1:]:
464 self.context.add_graph(g_name, self.tfssa.functions[g_name].graph)
--> 465 self.convert_main_graph(prog, graph)
466 return prog
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/tensorflow/converter.py:389, in TFConverter.convert_main_graph(self, prog, graph)
387 input_var = mb.cast(x=input_var, dtype="fp32", name=name)
388 self.context.add(name, input_var)
--> 389 outputs = convert_graph(self.context, graph, self.output_names)
390 ssa_func.set_outputs(outputs)
391 prog.add_function("main", ssa_func)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/tensorflow/convert_utils.py:191, in convert_graph(context, graph, outputs)
187 msg = "Conversion for TF op '{0}' not implemented.\n \n{1}".format(
188 node.op, node.original_node
189 )
190 raise NotImplementedError(msg)
--> 191 add_op(context, node)
193 if len(node.outputs) > 0:
194 # set_global / get_global / NoOp has no direct consumer / outputs
195 x = context[node.name]
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/tensorflow/ops.py:555, in Einsum(context, node)
553 a = context[node.inputs[0]]
554 b = context[node.inputs[1]]
--> 555 x = build_einsum_mil(a, b, equation, node.name)
556 context.add(node.name, x)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/_utils.py:171, in build_einsum_mil(a_var, b_var, equation, name)
169 x = mb.einsum(values=(b_var, a_var), equation=equation_rev, name=name)
170 else:
--> 171 x = solve_generic_einsum(parsed_vectors, a_var, b_var, name)
173 return x
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/_utils.py:331, in solve_generic_einsum(parsed_vectors, a_var, b_var, name)
328 return 1
329 return mb.concat(values=dims, axis=0)
--> 331 parsed_vectors, vars = solve_diagonal_einsum(parsed_vectors, [a_var, b_var])
332 parsed_vectors, vars = solve_sum_einsum(parsed_vectors, vars)
333 a_var, b_var = vars
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/_utils.py:254, in solve_diagonal_einsum(parsed_vectors, vars)
252 for i in range(len(vars)):
253 while len(parsed_vectors[i]) != len(set(parsed_vectors[i])):
--> 254 parsed_vector, var = solve_diagonal_einsum_one_step(parsed_vectors[i], vars[i])
255 parsed_vectors[i] = parsed_vector
256 vars[i] = var
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/frontend/_utils.py:246, in solve_diagonal_einsum.<locals>.solve_diagonal_einsum_one_step(parsed_vector, x)
244 indices = mb.range_1d(end=dim_length, start=0, step=1)
245 indices = mb.stack(values=[indices] * len(duplicated_indices), axis=1)
--> 246 x = mb.transpose(x=x, perm=perm)
247 x = mb.gather_nd(x=x, indices=indices)
248 ret_parsed_vector = [parsed_vector[0]] + parsed_vector[len(duplicated_indices):]
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/ops/registry.py:182, in SSAOpRegistry.register_op.<locals>.class_wrapper.<locals>.add_op(cls, **kwargs)
179 else:
180 op_cls_to_add = op_reg[op_type]
--> 182 return cls._add_op(op_cls_to_add, **kwargs)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/builder.py:182, in Builder._add_op(cls, op_cls, **kwargs)
180 curr_block()._insert_op_before(new_op, before_op=before_op)
181 new_op.build_nested_blocks()
--> 182 new_op.type_value_inference()
183 if len(new_op.outputs) == 1:
184 return new_op.outputs[0]
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/operation.py:253, in Operation.type_value_inference(self, overwrite_output)
243 def type_value_inference(self, overwrite_output=False):
244 """
245 Perform type inference and auto_val computation based on new input Vars
246 in kwargs. If self._output_vars is None then we generate _output_vars;
(...)
251 existing _output_vars
252 """
--> 253 output_types = self.type_inference()
254 if not isinstance(output_types, tuple):
255 output_types = (output_types,)
File ~/miniconda3/envs/prod/lib/python3.10/site-packages/coremltools/converters/mil/mil/ops/defs/iOS15/tensor_transformation.py:956, in transpose.type_inference(self)
954 if len(perm) != self.x.rank:
955 msg = "perm should have the same length as rank(x): {} != {}"
--> 956 raise ValueError(msg.format(len(perm), self.x.rank))
957 if self.x.rank == 0:
958 return self.x.sym_type # scalar cannot be transposed
ValueError: perm should have the same length as rank(x): 5 != 3
🐞Describe the bug
There is an issue when converting TF2 Keras models that contain MultiHeadAttention. The conversion fails with:
ValueError: Cannot add const [512*is10, 512]
The is10 variable increments each time I try (it looks like one of the converter's symbolic shape names, so the const's size depends on an unknown dimension). The problem seems to be when calculating the matrix size for one of the einsums; I can't tell whether it's the Q, K, V einsums or the scaled dot-product einsums causing trouble.
I also tried the MultiHeadAttention from TensorFlow Addons, but that one failed with unsupported einsums.
The model trains and executes fine, so this seems to be a conversion issue. I tried coremltools 5.0b2.
Trace
To Reproduce
The model source here reproduces this bug: https://github.com/keras-team/keras-io/blob/master/examples/generative/text_generation_with_miniature_gpt.py
System environment: