NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech).
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

Preprocessor cannot be exported to Core ML #7921

Closed: msis closed this issue 10 months ago

msis commented 1 year ago

Describe the bug

The ASR preprocessors cannot be exported to Core ML because of this function: make_seq_mask_like.

Any chance for a rewrite that allows the export?
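
For reference, here is the function in question, sketched from the snippets quoted later in this thread (the canonical implementation lives in nemo/collections/asr/parts/preprocessing/features.py):

import torch

@torch.jit.script_if_tracing
def make_seq_mask_like(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    # [B, T] boolean mask: True where the time index is within the sequence length
    mask = torch.arange(like.shape[time_dim], device=like.device).repeat(lengths.shape[0], 1).lt(lengths.view(-1, 1))
    # Unsqueeze until the mask has as many dims as `like`, so it broadcasts
    for _ in range(like.dim() - mask.dim()):
        mask = mask.unsqueeze(1)
    # Optionally flip so that True marks padding instead of valid positions
    if not valid_ones:
        mask = ~mask
    return mask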

Steps/Code to reproduce bug

Here I've isolated the function causing the issue:

import torch
import coremltools as ct
from nemo.collections.asr.parts.preprocessing.features import make_seq_mask_like

# Tensors captured from a real preprocessor run (attached below as 7921.zip)
_lengths = torch.load("lengths.pt")
_like = torch.load("like.pt")
_time_dim = torch.load("time_dim.pt")
_valid_ones = torch.load("valid_ones.pt")
_mask = torch.load("mask.pt")

# Sanity check: the recomputed mask matches the saved reference
mask = make_seq_mask_like(_lengths, _like, _time_dim, _valid_ones)

print(torch.all(torch.eq(mask, _mask)))

class SeqMaskLike(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, lengths, like):
        return make_seq_mask_like(lengths, like, _time_dim, _valid_ones)

model = SeqMaskLike()
model.eval()
traced_model = torch.jit.trace(model, [_lengths, _like])

# Convert the traced model to Core ML
model = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="lengths", shape=_lengths.shape),
        ct.TensorType(name="like", shape=_like.shape),
    ],
    outputs=[ct.TensorType(name="mask")],
    source="pytorch",
)

Here's the traceback of the error:

Traceback (most recent call last):
  File "/Users/msis/Projects/tarteel/nemo2ios/poc/nemo_tracing_issue.py", line 30, in <module>
    model = ct.convert(
            ^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/_converters_entry.py", line 574, in convert
    mlmodel = mil_convert(
              ^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 188, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 212, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 286, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 108, in __call__
    return load(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 80, in load
    return _perform_torch_convert(converter, debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 99, in _perform_torch_convert
    prog = converter.convert()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 519, in convert
    convert_nodes(self.context, self.graph)
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 88, in convert_nodes
    add_op(context, node)
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 3397, in getitem
    raise AssertionError("Item selection is supported only on python list/tuple objects")
AssertionError: Item selection is supported only on python list/tuple objects

Expected behavior

Ideally, this should just work:

import torch
from nemo.collections.asr.models import EncDecCTCModelBPE
import coremltools as ct

model = EncDecCTCModelBPE.from_pretrained(model_name="stt_en_conformer_ctc_large")

model.preprocessor.export("preprocessor.pt")

ts_model_preprocessor = torch.jit.load("preprocessor.pt")

input, input_length = model.preprocessor.input_example(max_batch=2)  # example inputs with batch size 2

traced_preprocessor = torch.jit.trace(ts_model_preprocessor, [input, input_length])

ct_preprocessor = ct.convert(
    traced_preprocessor,
    inputs=[
        ct.TensorType(name="input", shape=input.shape),
        ct.TensorType(name="input_length", shape=input_length.shape),
    ],
)
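
Assuming the conversion succeeds, a quick parity check against the traced preprocessor might look like this (hypothetical: predict() only runs on macOS, and the output key is whatever name coremltools auto-generates):

import numpy as np

ct_out = ct_preprocessor.predict({
    "input": input.numpy(),
    # TensorType defaults to fp32, so cast the integer lengths
    "input_length": input_length.numpy().astype(np.float32),
})
# NeMo's preprocessor returns (features, feature_lengths)
pt_feats, pt_len = traced_preprocessor(input, input_length)
print(np.allclose(list(ct_out.values())[0], pt_feats.numpy(), atol=1e-3))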


msis commented 1 year ago

A quick first win is to use .size() instead of .shape[] here (https://github.com/NVIDIA/NeMo/blob/08937c8e7dd2e782fac99c6c230ae0170c277700/nemo/collections/asr/parts/preprocessing/features.py#L150):

    mask = (
        torch.arange(like.size(time_dim), device=like.device)
        .repeat(lengths.size(0), 1)
        .lt(lengths.view(-1, 1))
    )
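
A minimal sketch (hypothetical function names) of why this matters: because make_seq_mask_like is scripted via @torch.jit.script_if_tracing, x.shape[dim] compiles to aten::size(x) followed by a list __getitem__, which is exactly the node the converter rejects above, whereas x.size(dim) compiles to a single aten::size(Tensor, int):

import torch

@torch.jit.script
def shape_index(x: torch.Tensor, dim: int) -> torch.Tensor:
    # compiles to aten::size(x) -> int[] plus aten::__getitem__
    return torch.arange(x.shape[dim])

@torch.jit.script
def size_call(x: torch.Tensor, dim: int) -> torch.Tensor:
    # compiles to a single aten::size(x, dim) -> int
    return torch.arange(x.size(dim))

print(shape_index.graph)
print(size_call.graph)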

But even with that change, the loop can't be converted:

Traceback (most recent call last):
  File "/Users/msis/Projects/tarteel/nemo2ios/poc/nemo_tracing_issue.py", line 30, in <module>
    model = ct.convert(
            ^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/_converters_entry.py", line 574, in convert
    mlmodel = mil_convert(
              ^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 188, in mil_convert
    return _mil_convert(model, convert_from, convert_to, ConverterRegistry, MLModel, compute_units, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 212, in _mil_convert
    proto, mil_program = mil_convert_to_proto(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 286, in mil_convert_to_proto
    prog = frontend_converter(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 108, in __call__
    return load(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 80, in load
    return _perform_torch_convert(converter, debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 99, in _perform_torch_convert
    prog = converter.convert()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 519, in convert
    convert_nodes(self.context, self.graph)
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 88, in convert_nodes
    add_op(context, node)
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 3272, in loop
    loop = mb.while_loop(
           ^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/mil/ops/registry.py", line 182, in add_op
    return cls._add_op(op_cls_to_add, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/mil/builder.py", line 183, in _add_op
    new_op.build_nested_blocks()
  File "/Users/msis/.virtualenvs/nemo2ios/lib/python3.11/site-packages/coremltools/converters/mil/mil/ops/defs/iOS15/control_flow.py", line 452, in build_nested_blocks
    raise ValueError(msg.format(
ValueError: loop_vars 'mask.1_x0' changes in the body of while_loop 'mask.45':
 <class 'coremltools.converters.mil.mil.types.type_tensor.tensor.<locals>.tensor'> -> <class 'coremltools.converters.mil.mil.types.type_tensor.tensor.<locals>.tensor'>

msis commented 1 year ago

7921.zip: a zip of the tensors used in the example.

1-800-BAD-CODE commented 1 year ago

There are a few more hurdles:

  1. The loop needs to be manually unrolled
  2. Core ML does not support logical not (~)
  3. Core ML does not support the bool type and will cast to fp32

Given that these seem to be Core ML shortcomings, I'm not sure the code should be changed (I wrote this function to be torch.jit-friendly and reasonably generalized).

With some assumptions, these ad hoc patches export with Core ML (though I didn't check for correctness, since I am unfamiliar with Core ML):

@torch.jit.script_if_tracing
def make_seq_mask_like(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    # use `ge` instead of `lt` if `valid_ones=False`
    mask = torch.arange(like.size(time_dim), device=like.device).repeat(lengths.size(0), 1).ge(lengths.view(-1, 1))

    # unroll loop [B, T] -> [B, 1, T] 
    mask = mask.unsqueeze(1)
    # for _ in range(like.dim() - mask.dim()):
    #     mask = mask.unsqueeze(1)

    # RuntimeError: PyTorch convert function for op '__not__' not implemented.
    # we addressed this with `ge` vs `lt`
    # if not valid_ones:
    #     mask = ~mask

    # coreml doesn't support bool output
    mask = mask.int()
    return mask
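
A quick sanity check of the patch (hypothetical values; note that with ge, True marks padding positions, i.e., the valid_ones=False convention):

lengths = torch.tensor([5, 3])
like = torch.zeros(2, 4, 5)  # assumed layout [B, D, T]
mask = make_seq_mask_like(lengths, like)
print(mask.shape)  # torch.Size([2, 1, 5])
print(mask)
# tensor([[[0, 0, 0, 0, 0]],
#         [[0, 0, 0, 1, 1]]], dtype=torch.int32)
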
msis commented 1 year ago

My goal is to export a NeMo model to Core ML, and this function was the last blocker. With a further modification that also drops the int cast, I can now export to Core ML:

@torch.jit.script_if_tracing
def make_seq_mask_like(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    # use `ge` instead of `lt` if `valid_ones=False`
    mask = torch.arange(like.size(time_dim), device=like.device).repeat(lengths.size(0), 1).ge(lengths.view(-1, 1))

    # unroll loop [B, T] -> [B, 1, T] 
    mask = mask.unsqueeze(1)
    # for _ in range(like.dim() - mask.dim()):
    #     mask = mask.unsqueeze(1)

    # RuntimeError: PyTorch convert function for op '__not__' not implemented.
    # we addressed this with `ge` vs `lt`
    # if not valid_ones:
    #     mask = ~mask

    # # coreml doesn't support bool output
    # mask = mask.int()
    return mask

This still works well with torch.jit, and without the cast I think it is just as general. So can it be updated?

1-800-BAD-CODE commented 1 year ago

It's really not my call, but I think the function has lost generality and should be used only for this purpose. E.g., the unrolled loop assumes like's shape is [B, D, T], valid_ones is ignored, and the output dtype is unintuitive.

If you grep this codebase for something like egrep '^\s*def .*mask(_|\()', you'll see about 10-20 different functions that create a similar mask in an ad hoc way... my intent was to provide a reusable one.

Given that these are basic and common operations, I think you'd be doing the coreml converter developers a favor by opening an issue with this function as an MWE (minimal working example).

msis commented 1 year ago

Core ML only supports tracing and not scripting... See the issue I linked above.

What other shapes does like take?

We can add conditions for valid_ones, no?
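
For what it's worth, a sketch of a trace-only variant (an assumption on my part, not tested end-to-end): if the @torch.jit.script_if_tracing decorator is dropped for the Core ML export path, the function executes as plain Python under torch.jit.trace, so like.dim() is a concrete int, the loop unrolls at trace time, and the valid_ones branch is baked in:

def make_seq_mask_like_traced(
    lengths: torch.Tensor, like: torch.Tensor, time_dim: int = -1, valid_ones: bool = True
) -> torch.Tensor:
    mask = torch.arange(like.size(time_dim), device=like.device).repeat(lengths.size(0), 1).lt(lengths.view(-1, 1))
    # Unrolled automatically during tracing, since like.dim() is a Python int here
    for _ in range(like.dim() - mask.dim()):
        mask = mask.unsqueeze(1)
    # Resolved at trace time, avoiding the unsupported logical-not op
    if not valid_ones:
        mask = ~mask
    # Cast because Core ML reportedly has no bool output type (per the discussion above)
    return mask.int()

The trade-off is that each trace is specialized to the example input's rank and the valid_ones value passed at trace time.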

github-actions[bot] commented 11 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

msis commented 11 months ago

?

1-800-BAD-CODE commented 11 months ago

E.g., like takes a 4-dimensional shape (B, M, F, N) here: https://github.com/NVIDIA/NeMo/blob/4fd1c74299d5c2c06c8b04e7a64c83fd9ab5a5da/nemo/collections/asr/modules/audio_modules.py#L461-L463. I don't even know what the dimensions (B, M, F, N) are, but it works, which is the benefit of keeping it so general.

Conditions for valid_ones already exist in the original implementation... it's the coreml converter that cannot capture the behavior. This behavior is used, e.g., here to invert the default value: https://github.com/NVIDIA/NeMo/blob/4fd1c74299d5c2c06c8b04e7a64c83fd9ab5a5da/nemo/collections/asr/losses/audio_losses.py#L56

I've verified that the original code works when exported to ONNX:

ONNX test:

import numpy as np
import onnxruntime
import torch

# `model` wraps make_seq_mask_like as in the repro above; `lengths` and
# `like` are example tensors
torch.onnx.export(
    model=model,
    args=(lengths, like),
    f="tmp.onnx",
    input_names=["lengths", "like"],
    opset_version=17,
    dynamic_axes={"lengths": [0], "like": [0, 1, 2]},
)
session = onnxruntime.InferenceSession("tmp.onnx")
outputs = session.run(
    None,
    {
        "lengths": np.array([5, 3, 1]).astype(np.float32),
        "like": np.random.random(size=[3, 2, 5]).astype(np.float32),
    },
)
print(outputs[0])
# output:
# [[[ True  True  True  True  True]]
#  [[ True  True  True False False]]
#  [[ True False False False False]]]

So it works for JIT and ONNX, just not the coreml converter (which I think is immature).

I still think it should be patched locally for the coreml converter for now, until it supports more operators. @titu1994 would make the final call; I'm just a random guy who contributes for fun, not a maintainer.

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 10 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.