Quantization of yolov5 for XNNPACK backend

EarthMu commented 10 months ago

Hello, I want to quantize the yolov5 model for the XNNPACK backend on an Intel CPU.

So far, I have confirmed the following:

Exporting resnet18 into xnnpack-exported-resnet18.pte for XNNPACK backend.
Running xnnpack-resnet18.pte
Exporting resnet18 into xnnpack-exported-quantized-resnet18.pte with quantization.
Running xnnpack-exported-quantized-resnet18.pte and confirming it is faster than xnnpack-resnet18.pte.
Exporting yolov5s.pt into xnnpack-yolov5s.pte for XNNPACK backend.
Running xnnpack-yolov5s.pte
Exporting yolov5s.pt into xnnpack-exported-quantized-yolov5s.pte with quantization.
Running xnnpack-exported-quantized-yolov5s.pte fails with the error shown in the Error section.

So I could run quantized-resnet18 and exported-yolov5s, but I couldn't run quantized-yolov5s. I want to know the following:

How can I successfully run quantized-yolov5s on Intel CPU?
What is the cause of the error?
Are the inference speeds of exported-yolov5s, exported-resnet18, and exported-quantized-resnet18 expected? Can I make them faster?

Error

I 00:00:00.095980 executorch:executor_runner.cpp:138] Model file ./xnnpack-exported-quantized-yolov5s.pte is loaded.
I 00:00:00.096002 executorch:executor_runner.cpp:147] Using method forward
I 00:00:00.096006 executorch:executor_runner.cpp:195] Setting up planned buffer 0, size 65547280.
I 00:00:00.157690 executorch:executor_runner.cpp:199] Allocating planned_memory
I 00:00:00.157709 executorch:executor_runner.cpp:213] Loading method in program
E 00:00:00.191145 executorch:XNNCompiler.cpp:427] Failed to create multiply node 800 with code: xnn_status_invalid_parameter
E 00:00:00.191248 executorch:XNNPACKBackend.cpp:46] XNNCompiler::compleModel failed: 0x1
E 00:00:00.191360 executorch:XNNCompiler.cpp:427] Failed to create multiply node 824 with code: xnn_status_invalid_parameter
E 00:00:00.191402 executorch:XNNPACKBackend.cpp:46] XNNCompiler::compleModel failed: 0x1
I 00:00:00.191721 executorch:executor_runner.cpp:220] Method loaded.
I 00:00:00.192612 executorch:executor_runner.cpp:239] Inputs prepared.
E 00:00:01.029698 executorch:XNNPACKBackend.cpp:64] External id and expected delegate args mismatch
E 00:00:01.029722 executorch:method.cpp:962] CALL_DELEGATE execute failed at instruction 76: 0x1
F 00:00:01.029727 executorch:executor_runner.cpp:241] In function main(), assert failed (status == Error::Ok): Execution of method forward failed with status 0x1
Aborted (core dumped)

Environment

https://github.com/pytorch/executorch/issues/1313#issue-2017895036

Inference speed on resnet18

Non-Quantized resnet18

32.85ms per image

I 00:00:01.515202 executorch:executor_runner.cpp:239] Inputs prepared.
I 00:00:01.548059 executorch:executor_runner.cpp:246] Model executed successfully.

Quantized resnet18

22.56ms per image

I 00:00:01.653248 executorch:executor_runner.cpp:239] Inputs prepared.
I 00:00:01.675816 executorch:executor_runner.cpp:246] Model executed successfully.
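The per-image figures above are the gap between the "Inputs prepared." and "Model executed successfully." timestamps. A minimal sketch of that arithmetic follows, assuming the log prefix format shown in this thread; the function names `timestamp_seconds` and `inference_ms` are illustrative and not part of ExecuTorch.

```python
import re

# Matches the "HH:MM:SS.micros" timestamp at the start of an executor_runner log line.
_TS = re.compile(r"^[A-Z] (\d+):(\d+):(\d+)\.(\d+) ")


def timestamp_seconds(line: str) -> float:
    """Convert the leading log timestamp of one line to seconds."""
    m = _TS.match(line)
    if m is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    hours, minutes, seconds, frac = m.groups()
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(frac) / 10 ** len(frac)


def inference_ms(log: str) -> float:
    """Milliseconds between 'Inputs prepared.' and 'Model executed successfully.'."""
    start = end = None
    for line in log.splitlines():
        line = line.strip()
        if "Inputs prepared." in line:
            start = timestamp_seconds(line)
        elif "Model executed successfully." in line:
            end = timestamp_seconds(line)
    if start is None or end is None:
        raise ValueError("log lacks one of the two marker lines")
    return (end - start) * 1000.0


# Log lines copied from the two resnet18 runs in this post.
FP32_LOG = """\
I 00:00:01.515202 executorch:executor_runner.cpp:239] Inputs prepared.
I 00:00:01.548059 executorch:executor_runner.cpp:246] Model executed successfully.
"""
INT8_LOG = """\
I 00:00:01.653248 executorch:executor_runner.cpp:239] Inputs prepared.
I 00:00:01.675816 executorch:executor_runner.cpp:246] Model executed successfully.
"""

fp32_ms = inference_ms(FP32_LOG)
int8_ms = inference_ms(INT8_LOG)
print(f"fp32: {fp32_ms:.2f} ms, quantized: {int8_ms:.2f} ms, speedup: {fp32_ms / int8_ms:.2f}x")
```

A single run like this includes cold-cache effects, so averaging over several runs would give a steadier number.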

Running yolov5.pte model for XNNPACK backends

Exporting code

import torch

def load_yolov5():
    ...  # body elided in the original post
if __name__ == '__main__':
    from torch._export import capture_pre_autograd_graph
    from torch.export import export, ExportedProgram
    import executorch.exir as exir
    from executorch.exir import to_edge
    from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
    from executorch.exir import ExecutorchBackendConfig, ExecutorchProgramManager

    ## Loading model
    yolov5 = load_yolov5()
    ## export to exir
    example_args = (torch.randn(1, 3, 640, 640), )
    pre_autograd_aten_dialect = capture_pre_autograd_graph(yolov5, example_args)
    ## export to aten dialect
    aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, example_args)
    ## export to edge
    edge_program: exir.EdgeProgramManager = to_edge(aten_dialect)
    edge_program: exir.EdgeProgramManager = edge_program.to_backend(XnnpackPartitioner)
    ## export to executorch
    executorch_program: exir.ExecutorchProgramManager = edge_program.to_executorch(
        ExecutorchBackendConfig(
            passes=[],  # User-defined passes
        )
    )
    ## save pte model
    with open("xnnpack-exported-yolov5s.pte", "wb") as file:
        file.write(executorch_program.buffer)

Inference log

$ ./cmake-out/backends/xnnpack/xnn_executor_runner --model_path ./xnnpack-exported-yolov5s.pte --img_path test.jpg 
Number of arguments: 5
Argument 0: ./cmake-out/backends/xnnpack/xnn_executor_runner
Argument 1: --model_path
Argument 2: ./xnnpack-exported-yolov5s.pte
Argument 3: --img_path
Argument 4: test.jpg
I 00:00:00.098044 executorch:executor_runner.cpp:138] Model file ./xnnpack-exported-yolov5s.pte is loaded.
I 00:00:00.098070 executorch:executor_runner.cpp:147] Using method forward
I 00:00:00.098076 executorch:executor_runner.cpp:195] Setting up planned buffer 0, size 67190464.
I 00:00:00.160809 executorch:executor_runner.cpp:199] Allocating planned_memory
I 00:00:00.160826 executorch:executor_runner.cpp:213] Loading method in program
I 00:00:00.190032 executorch:executor_runner.cpp:220] Method loaded.
I 00:00:00.190967 executorch:executor_runner.cpp:239] Inputs prepared.
I 00:00:01.079164 executorch:executor_runner.cpp:246] Model executed successfully.

Running quantized yolov5.pte model for XNNPACK backends

Exporting code

import torch

def load_yolov5():
    ...  # body elided in the original post
if __name__ == '__main__':
    from torch._export import capture_pre_autograd_graph
    from torch.export import export, ExportedProgram
    import executorch.exir as exir
    from executorch.exir import to_edge
    from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
    from executorch.exir import ExecutorchBackendConfig, ExecutorchProgramManager
    ## For Quantization
    from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
    from torch.ao.quantization.quantizer.xnnpack_quantizer import (
        get_symmetric_quantization_config,
        XNNPACKQuantizer,
    )
    from executorch.exir import EdgeCompileConfig

    ## Loading model
    yolov5 = load_yolov5()
    ## export to exir
    example_args = (torch.randn(1, 3, 640, 640), )
    pre_autograd_aten_dialect = capture_pre_autograd_graph(yolov5, example_args)

    ## Quantizing
    quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
    prepared_graph = prepare_pt2e(pre_autograd_aten_dialect, quantizer)
    converted_graph = convert_pt2e(prepared_graph)

    ## export to aten dialect
    aten_dialect: ExportedProgram = export(converted_graph, example_args)
    ## export to edge
    edge_program: exir.EdgeProgramManager = to_edge(aten_dialect, compile_config=EdgeCompileConfig(_check_ir_validity=False))
    edge_program: exir.EdgeProgramManager = edge_program.to_backend(XnnpackPartitioner)
    ## export to executorch
    executorch_program: exir.ExecutorchProgramManager = edge_program.to_executorch(
        ExecutorchBackendConfig(
            passes=[],  # User-defined passes
        )
    )
    ## save pte model
    with open("xnnpack-exported-quantized-yolov5s.pte", "wb") as file:
        file.write(executorch_program.buffer)
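The script above calls convert_pt2e directly after prepare_pt2e. In the PT2E flow, representative inputs are normally run through the prepared model between those two calls so the observers can record activation ranges. Whether the missing calibration is related to the multiply-node failure is not established in this thread. A minimal sketch of such a step follows; the `calibrate` helper is hypothetical and only assumes the prepared model is callable.

```python
from typing import Any, Callable, Iterable, Tuple


def calibrate(prepared_model: Callable[..., Any],
              batches: Iterable[Tuple[Any, ...]]) -> int:
    """Run each batch through an observer-instrumented model.

    Returns the number of batches seen. The outputs are discarded: the
    forward passes exist only to let the observers collect statistics.
    """
    count = 0
    for batch in batches:
        prepared_model(*batch)  # each batch is a tuple of positional inputs
        count += 1
    if count == 0:
        raise ValueError("calibration needs at least one batch")
    return count


# Where it would sit in the export script (names taken from the post):
#   prepared_graph = prepare_pt2e(pre_autograd_aten_dialect, quantizer)
#   calibrate(prepared_graph, [example_args])  # ideally real, preprocessed images
#   converted_graph = convert_pt2e(prepared_graph)
```

Random tensors such as example_args only exercise the code path; real preprocessed images would give more meaningful ranges.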

Inference log (Error)

$ ./cmake-out/backends/xnnpack/xnn_executor_runner --model_path ./xnnpack-exported-quantized-yolov5s.pte --img_path test.jpg 
Number of arguments: 5
Argument 0: ./cmake-out/backends/xnnpack/xnn_executor_runner
Argument 1: --model_path
Argument 2: ./xnnpack-exported-quantized-yolov5s.pte
Argument 3: --img_path
Argument 4: test.jpg
I 00:00:00.095980 executorch:executor_runner.cpp:138] Model file ./xnnpack-exported-quantized-yolov5s.pte is loaded.
I 00:00:00.096002 executorch:executor_runner.cpp:147] Using method forward
I 00:00:00.096006 executorch:executor_runner.cpp:195] Setting up planned buffer 0, size 65547280.
I 00:00:00.157690 executorch:executor_runner.cpp:199] Allocating planned_memory
I 00:00:00.157709 executorch:executor_runner.cpp:213] Loading method in program
E 00:00:00.191145 executorch:XNNCompiler.cpp:427] Failed to create multiply node 800 with code: xnn_status_invalid_parameter
E 00:00:00.191248 executorch:XNNPACKBackend.cpp:46] XNNCompiler::compleModel failed: 0x1
E 00:00:00.191360 executorch:XNNCompiler.cpp:427] Failed to create multiply node 824 with code: xnn_status_invalid_parameter
E 00:00:00.191402 executorch:XNNPACKBackend.cpp:46] XNNCompiler::compleModel failed: 0x1
I 00:00:00.191721 executorch:executor_runner.cpp:220] Method loaded.
I 00:00:00.192612 executorch:executor_runner.cpp:239] Inputs prepared.
E 00:00:01.029698 executorch:XNNPACKBackend.cpp:64] External id and expected delegate args mismatch
E 00:00:01.029722 executorch:method.cpp:962] CALL_DELEGATE execute failed at instruction 76: 0x1
F 00:00:01.029727 executorch:executor_runner.cpp:241] In function main(), assert failed (status == Error::Ok): Execution of method forward failed with status 0x1
Aborted (core dumped)

Thanks,
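In the log above, both load-time failures are multiply nodes rejected with the same status, and the execute-time error follows afterwards. A small sketch that groups such lines is shown here, assuming the message format printed by XNNCompiler.cpp in this thread; `failed_nodes` is an illustrative helper, and it only summarises the log without explaining why XNNPACK rejected the nodes.

```python
import re
from collections import defaultdict

# Pattern for the XNNCompiler error lines as they appear in this thread.
NODE_FAILURE = re.compile(r"Failed to create (\w+) node (\d+) with code: (\w+)")


def failed_nodes(log: str) -> dict:
    """Group failed node ids by (node kind, XNNPACK status)."""
    failures = defaultdict(list)
    for line in log.splitlines():
        m = NODE_FAILURE.search(line)
        if m:
            kind, node_id, status = m.groups()
            failures[(kind, status)].append(int(node_id))
    return dict(failures)


# Error lines copied from the inference log in this post.
LOG = """\
E 00:00:00.191145 executorch:XNNCompiler.cpp:427] Failed to create multiply node 800 with code: xnn_status_invalid_parameter
E 00:00:00.191248 executorch:XNNPACKBackend.cpp:46] XNNCompiler::compleModel failed: 0x1
E 00:00:00.191360 executorch:XNNCompiler.cpp:427] Failed to create multiply node 824 with code: xnn_status_invalid_parameter
E 00:00:00.191402 executorch:XNNPACKBackend.cpp:46] XNNCompiler::compleModel failed: 0x1
"""

for (kind, status), ids in failed_nodes(LOG).items():
    print(f"{kind}: {status} at nodes {ids}")
```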

cccclai commented 10 months ago

cc: @digantdesai, @mcr229

sriomsubham commented 10 months ago

Hi @EarthMu,
Can you please share the code of 'def load_yolov5():'?

Thanks,

mcr229 commented 10 months ago

Hi @EarthMu,

For your first two questions, it is a little difficult for us to tell what exactly is going wrong here. Would it be possible to share your model so we can repro on our end? Also, if you set the XNNPACK library to Debug mode here:

https://github.com/pytorch/executorch/blob/main/backends/xnnpack/cmake/Dependencies.cmake

We might be able to get some better logs as to what exactly is failing.

As for your third question about quantized resnet speed, can you share how you're benchmarking, what numbers you are getting, and what speeds you are expecting or wanting?

ali-khosh commented 9 months ago

Hey @EarthMu. I'm the ExecuTorch product manager with the PyTorch team. Thanks for your interest in trying and contributing to ExecuTorch! I noticed you've been filing issues and interacting with the team and was wondering if you're interested in having an informal conversation so we learn more about your use case, wish list and existing pain points. Please feel free to email me at khosh@meta.com. Thanks. Ali.

khimanir commented 8 months ago

@EarthMu Could you please share your code on how you got yolo_v5 loaded and converted to a PTE file? I'm failing at the capture_pre_autograd_graph step and would like to see how you loaded the model.

Thanks

EarthMu commented 8 months ago

Hi @khimanir, the code in this thread is what I used to convert the yolov5 model to a PTE file.
Before converting the model to PTE format, you need to check the following:

The yolov5 model was loaded successfully.

example_args matches the input shape that yolov5 expects.

## Loading model
yolov5 = load_yolov5()
## export to exir
example_args = (torch.randn(1, 3, 640, 640), )
pre_autograd_aten_dialect = capture_pre_autograd_graph(yolov5, example_args)

Or you can share the error log with us so we can help you further.

Thanks

sriomsubham commented 8 months ago

There is one way that worked for me. I wrote the yolov5 structure in PyTorch from scratch (I did not use the .yaml file for the structure or any extra file from the yolov5 repo), and I was able to convert it to .pte.

khimanir commented 8 months ago


import logging
from ultralytics import YOLO
import torch

from torchvision import models

from ..model_base import EagerModelBase

class YV5Model(EagerModelBase):
    def __init__(self):
        pass

    def get_eager_model(self) -> torch.nn.Module:
        logging.info("Loading yolo_v5 model")
        model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # load a pretrained model (recommended for training)
        logging.info("Loaded yolo_v5 model")
        im = 'https://ultralytics.com/images/zidane.jpg'
        results = model(im)
        results.print()

        return model

    def get_example_inputs(self):
        tensor_size = (1, 3, 640, 640)
        return (torch.randn(tensor_size),)

@EarthMu @sriomsubham

This is how I tried to load it.

I ran it with the following command:

python3 -m examples.portable.scripts.export --model_name="yv5"

and received:

(executorch2) vboxuser@Pytorch:~/executorch$ python3 -m examples.portable.scripts.export --model_name="yv5"
[INFO 2024-01-31 09:25:37,771 model.py:21] Loading yolo_v5 model
Using cache found in /home/vboxuser/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 2023-12-19 Python-3.10.13 torch-2.3.0.dev20240123+cpu CPU

Fusing layers... 
[W NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 
[INFO 2024-01-31 09:25:39,377 model.py:23] Loaded yolo_v5 model
image 1/1: 720x1280 2 persons, 2 ties
Speed: 821.8ms pre-process, 469.2ms inference, 15.9ms NMS per image at shape (1, 3, 384, 640)
Traceback (most recent call last):
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/vboxuser/executorch/examples/portable/scripts/export.py", line 65, in <module>
    main()  # pragma: no cover
  File "/home/vboxuser/executorch/examples/portable/scripts/export.py", line 60, in main
    prog = export_to_exec_prog(model, example_inputs, dynamic_shapes=dynamic_shapes)
  File "/home/vboxuser/executorch/examples/portable/utils.py", line 82, in export_to_exec_prog
    m = capture_pre_autograd_graph(m, example_inputs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_export/__init__.py", line 161, in capture_pre_autograd_graph
    m = torch._dynamo.export(
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1288, in inner
    result_traced = opt_f(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 417, in _fn
    return fn(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 580, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 386, in _convert_frame_assert
    return _compile(
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 645, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 248, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 526, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1033, in transform_code_object
    transformations(instructions, code_options)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 151, in _fn
    return fn(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 491, in transform
    tracer.run()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2122, in run
    super().run()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 787, in run
    and self.step()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 750, in step
    getattr(self, inst.opname)(inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 469, in wrapper
    return inner_fn(self, inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1237, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 651, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/variables/nn_module.py", line 332, in call_function
    return tx.inline_user_function_return(
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 657, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2257, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2366, in inline_call_
    tracer.run()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 787, in run
    and self.step()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 750, in step
    getattr(self, inst.opname)(inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 469, in wrapper
    return inner_fn(self, inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1237, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 651, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 322, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 276, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 84, in call_function
    return tx.inline_user_function_return(
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 657, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2257, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2366, in inline_call_
    tracer.run()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 787, in run
    and self.step()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 750, in step
    getattr(self, inst.opname)(inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 469, in wrapper
    return inner_fn(self, inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1237, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 651, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 276, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 84, in call_function
    return tx.inline_user_function_return(
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 657, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2257, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2366, in inline_call_
    tracer.run()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 787, in run
    and self.step()
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 750, in step
    getattr(self, inst.opname)(inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1073, in SETUP_WITH
    self.setup_or_before_with(inst)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1804, in setup_or_before_with
    unimplemented(f"{inst.opname} {ctx}")
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 190, in unimplemented
    raise Unsupported(msg)
torch._dynamo.exc.Unsupported: SETUP_WITH UserDefinedObjectVariable(Profile)

from user code:
   File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 25, in inner
    return fn(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/vboxuser/miniconda3/envs/executorch2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/vboxuser/.cache/torch/hub/ultralytics_yolov5_master/models/common.py", line 681, in forward
    with dt[0]:

Are there any specific instructions on how to load it differently? I'm very new to this.

Thanks!

EarthMu commented 7 months ago

@khimanir

Hi, sorry for the late reply. I think this environment or method is not supported by PyTorch Dynamo, so you need to change it to be compatible with PyTorch Dynamo.

torch._dynamo.exc.Unsupported: SETUP_WITH UserDefinedObjectVariable(Profile)

Thanks,
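The frame that Dynamo rejects, `with dt[0]:` in models/common.py, appears to belong to the AutoShape wrapper that the hub loader adds (the log above prints "Adding AutoShape..."). A minimal sketch of one way to avoid it follows, assuming the yolov5 hub entry point's `autoshape` keyword; `load_raw_yolov5` and `try_export` are hypothetical helper names, and the export may still fail on other constructs in the model.

```python
def load_raw_yolov5(name: str = "yolov5s"):
    """Load YOLOv5 from torch.hub without the AutoShape wrapper."""
    import torch  # imported lazily; downloading the hub repo needs network access

    # autoshape=False asks the yolov5 hub entry point to skip AutoShape, whose
    # forward() contains the `with dt[0]:` profiling block from the traceback.
    model = torch.hub.load("ultralytics/yolov5", name, autoshape=False)
    return model.eval()


def try_export(model, image_size: int = 640):
    """Attempt torch.export on a fixed-size input tensor."""
    import torch

    # Same example input shape as used elsewhere in this thread.
    example_args = (torch.randn(1, 3, image_size, image_size),)
    return torch.export.export(model, example_args)
```

Usage would be `exported = try_export(load_raw_yolov5())`. Without AutoShape the model takes an already preprocessed tensor and returns raw predictions, so resizing, normalisation, and NMS have to happen outside the exported graph.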

liuyibox commented 1 week ago

Any progress on this? I still cannot export yolov5 or yolov8 successfully after a whole day of trying. Here is my code, with the error message attached. I'm using executorch 0.3 and torch 2.4.0. So far, I can export simple models, but not yolov5 or yolov8. Any help would be appreciated.

BTW @khimanir I followed your scripts, and got the same errors as you.

import torch
from ultralytics import YOLO
import os
HOME = os.getcwd()

class YOLOv10Core(torch.nn.Module):
    def __init__(self, model):
        super(YOLOv10Core, self).__init__()
        self.model = model.model
    def forward(self, x):
        return self.model(x)

original_model = YOLO(f"{HOME}/yolov8n.pt")
yolo_core_model = YOLOv10Core(original_model)
input_tensor = torch.ones(1, 3, 640, 640)
aten_dialect = torch.export.export(yolo_core_model, (input_tensor,))

torch._dynamo.exc.InternalTorchDynamoError: Pending unbacked symbols {zuf0} not in returned outputs FakeTensor(..., size=(6400, 1)) ((1, 1), 0). Did you accidentally call new_dynamicsize() or item() more times than you needed to in your fake implementation? For more help, see https://docs.google.com/document/d/1RWrH-3wLEpzR9kCS6gGBNen-Fs-8PVbWWFE5AcgeWE/edit

from user code:
  File "/home/liuyi/executorch/examples/models/ems_yolo10/export_yolo.py", line 11, in forward
    return self.model(x)
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/ultralytics/nn/tasks.py", line 94, in forward
    return self.predict(x, *args, **kwargs)
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/ultralytics/nn/tasks.py", line 112, in predict
    return self._predict_once(x, profile, visualize, embed)
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/ultralytics/nn/tasks.py", line 133, in _predict_once
    x = m(x) # run
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/ultralytics/nn/modules/head.py", line 86, in forward
    return self.inference(y)
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/ultralytics/nn/modules/head.py", line 50, in inference
    self.anchors, self.strides = (x.transpose(0, 1) for x in make_anchors(x, self.stride, 0.5))
  File "/home/liuyi/anaconda3/envs/executorch/lib/python3.10/site-packages/ultralytics/utils/tal.py", line 305, in make_anchors
    stride_tensor.append(torch.full((h * w, 1), stride, dtype=dtype, device=device))

mcr229 commented 4 days ago

Hi, sorry, this issue seems to be coming from torch.export.export. Do you mind cross-posting this in pytorch/pytorch?

cc @angelayi @tugsbayasgalan
