Closed msis closed 10 months ago
A quick first win is to use `.size()` instead of `.shape[]` here https://github.com/NVIDIA/NeMo/blob/08937c8e7dd2e782fac99c6c230ae0170c277700/nemo/collections/asr/parts/preprocessing/features.py#L150 :
mask = (
    torch.arange(like.size(time_dim), device=like.device)
    .repeat(lengths.size(0), 1)
    .lt(lengths.view(-1, 1))
)
But then the loop fails to convert:
Traceback (most recent call last):
File "/Users/msis/Projects/tarteel/nemo2ios/poc/nemo_tracing_issue.py", line 30, in <module>
model = ct.convert(
^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/_converters_entry.py", line 574, in convert
mlmodel = mil_convert(
^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 188, in mil_convert
return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 212, in _mil_convert
proto, mil_program = mil_convert_to_proto(
^^^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 286, in mil_convert_to_proto
prog = frontend_converter(model, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 108, in __call__
return load(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 80, in load
return _perform_torch_convert(converter, debug)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 99, in _perform_torch_convert
prog = converter.convert()
^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 519, in convert
convert_nodes(self.context, self.graph)
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 88, in convert_nodes
add_op(context, node)
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 3272, in loop
loop = mb.while_loop(
^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/mil/ops/registry.py", line 182, in add_op
return cls._add_op(op_cls_to_add, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/mil/builder.py", line 183, in _add_op
new_op.build_nested_blocks()
File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/mil/ops/defs/iOS15/control_flow.py", line 452, in build_nested_blocks
raise ValueError(msg.format(
ValueError: loop_vars 'mask.1_x0' changes in the body of while_loop 'mask.45':
<class 'coremltools.converters.mil.mil.types.type_tensor.tensor.<locals>.tensor'> -> <class 'coremltools.converters.mil.mil.types.type_tensor.tensor.<locals>.tensor'>
There are a few more hurdles:
- coreml does not support the logical not
- coreml does not support bool type and will cast to fp32
Given that these seem to be coreml shortcomings, I'm not sure the code should be changed (I wrote this function to be `torch.jit`-friendly and reasonably generalized).
With some assumptions, these ad hoc patches export with coreml (though I didn't check for correctness, since I am unfamiliar with coreml):
@torch.jit.script_if_tracing
def make_seq_mask_like(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    # use `ge` instead of `lt` if `valid_ones=False`
    mask = torch.arange(like.size(time_dim), device=like.device).repeat(lengths.size(0), 1).ge(lengths.view(-1, 1))
    # unroll loop [B, T] -> [B, 1, T]
    mask = mask.unsqueeze(1)
    # for _ in range(like.dim() - mask.dim()):
    #     mask = mask.unsqueeze(1)
    # RuntimeError: PyTorch convert function for op '__not__' not implemented.
    # we addressed this with `ge` vs `lt`
    # if not valid_ones:
    #     mask = ~mask
    # coreml doesn't support bool output
    mask = mask.int()
    return mask
My goal is to export a NeMo model to Core ML. That function was the last issue I had. With the new modification, I can now export to Core ML without casting the bool mask to int:
@torch.jit.script_if_tracing
def make_seq_mask_like(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    # use `ge` instead of `lt` if `valid_ones=False`
    mask = torch.arange(like.size(time_dim), device=like.device).repeat(lengths.size(0), 1).ge(lengths.view(-1, 1))
    # unroll loop [B, T] -> [B, 1, T]
    mask = mask.unsqueeze(1)
    # for _ in range(like.dim() - mask.dim()):
    #     mask = mask.unsqueeze(1)
    # RuntimeError: PyTorch convert function for op '__not__' not implemented.
    # we addressed this with `ge` vs `lt`
    # if not valid_ones:
    #     mask = ~mask
    # # coreml doesn't support bool output
    # mask = mask.int()
    return mask
This still works well with `torch.jit` and I think it is just as generalized if no casting is done.
So can it be updated?
It's really not my call, but I think the function has lost generality and should be used only for this purpose. E.g., the unrolled loop assumes `like`'s shape is [B, D, T], `valid_ones` is ignored, and the output dtype is unintuitive.
If you grep this codebase for something like `egrep '^\s*def .*mask(_|\()'` you'll see about 10-20 different functions that create a similar mask in an ad hoc way... my intent was a reusable one.
Given that these are basic and common operations, I think you'd be doing the coreml converter developers a favor by opening an issue with this function as an MWE.
Core ML only supports tracing and not scripting... See the issue I linked above.
What other shapes does `like` take?
We can add conditions for `valid_ones`, no?
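One possible shape for such a variant, as a rough sketch only: the name `make_seq_mask_traced` is hypothetical and not part of NeMo. It assumes the function is captured with plain tracing (no `script_if_tracing` decorator), so the Python-level `if` on `valid_ones` and the rank of `like` are fixed at trace time. It picks `lt` or `ge` instead of using `~`, and replaces the loop with a single reshape. I have not run it through the coreml converter.

```python
import torch


def make_seq_mask_traced(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    """Hypothetical trace-oriented variant; not the NeMo implementation."""
    steps = torch.arange(like.size(time_dim), device=like.device).unsqueeze(0)  # [1, T]
    lens = lengths.view(-1, 1)                                                  # [B, 1]
    # Choose the comparison up front instead of inverting afterwards with `~`.
    mask = steps.lt(lens) if valid_ones else steps.ge(lens)                     # [B, T]
    # One reshape to [B, 1, ..., 1, T] instead of a loop of unsqueeze calls.
    shape = [mask.size(0)] + [1] * (like.dim() - 2) + [mask.size(1)]
    mask = mask.reshape(shape)
    # Move the time axis if it is not the last dimension.
    if time_dim != -1:
        mask = mask.transpose(-1, time_dim)
    return mask


# Example with made-up sizes: like is [B, M, F, N].
lengths = torch.tensor([1, 3])
mask_4d = make_seq_mask_traced(lengths, torch.zeros(2, 2, 5, 3))
pad_mask = make_seq_mask_traced(lengths, torch.zeros(2, 2, 5, 3), valid_ones=False)
```

The trade-off is that each traced graph is specialized to one `valid_ones` value and one input rank, so this would sit next to the general function rather than replace it.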
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
?
E.g., `like` takes a 4-dimensional shape (B, M, F, N) here: https://github.com/NVIDIA/NeMo/blob/4fd1c74299d5c2c06c8b04e7a64c83fd9ab5a5da/nemo/collections/asr/modules/audio_modules.py#L461-L463
I don't even know what the dimensions (B, M, F, N) are, but it works, which is the benefit of keeping it so general.
Conditions for `valid_ones` already exist in the original implementation... it's the coreml converter that cannot capture the behavior. This behavior is used, e.g., here to invert the default value: https://github.com/NVIDIA/NeMo/blob/4fd1c74299d5c2c06c8b04e7a64c83fd9ab5a5da/nemo/collections/asr/losses/audio_losses.py#L56
I've verified that the original code works when exported to ONNX:
So it works for JIT and ONNX, just not the coreml converter (which I think is immature).
I still think it should be locally patched for the coreml converter for now, until they support more operators. @titu1994 would make the final call, I'm just a random guy who contributes for fun, not a maintainer.
This issue was closed because it has been inactive for 7 days since being marked as stale.
Describe the bug
The asr preprocessors cannot be exported to Core ML because of this function: `make_seq_mask_like`. Any chance for a rewrite that allows the export?
Steps/Code to reproduce bug
Here I singled out the function causing issues:
Here's the traceback of the error:
Expected behavior
Ideally, this should just work:
Environment overview (please complete the following information)
pip install nemo_toolkit[asr]
Environment details
If NVIDIA docker image is used you don't need to specify these. Otherwise, please provide: