pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

Upcoming changes to export API in ExecuTorch (published on 9/12/2023) #290


kimishpatel commented 9 months ago

Where are we?

Exporting a PyTorch model for the ExecuTorch runtime goes through multiple AoT (ahead-of-time) stages. At a high level there are 3 stages (see the sketch after this list):

  1. exir.capture: captures the model's graph using ATen IR.
  2. to_edge: translates the ATen dialect into the edge dialect, with dtype specialization.
  3. to_executorch: translates the edge dialect into the executorch dialect, running various passes (e.g. out-variant conversion, memory planning) to make the model ready for the executorch runtime.
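
For concreteness, a minimal sketch of the three stages end to end, using the same exir.capture API shown elsewhere in this post (model and example_inputs are placeholders):

import executorch.exir as exir

capture_result = exir.capture(model, example_inputs)  # 1. capture ATen IR
edge_program = capture_result.to_edge()               # 2. ATen dialect -> edge dialect
executorch_program = edge_program.to_executorch()     # 3. edge dialect -> executorch dialect

# Serialize the program for the executorch runtime:
with open("model.pte", "wb") as f:
    f.write(executorch_program.buffer)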

Two important stops in a model's journey to the executorch runtime are: a) quantization and b) delegation.

Entry points for quantization are between steps 1 and 2. Thus the quantization APIs consume ATen IR and are not edge/executorch specific.

Entry points for delegation are between steps 2 and 3. Thus the delegation APIs consume edge dialect IR, as sketched below.
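
A hedged illustration of that delegation entry point (the partitioner-based to_backend call is a sketch, and MyPartitioner is a hypothetical partitioner; real backends ship their own under executorch.backends.*, and the exact signature may differ across versions):

from executorch.exir.backend.backend_api import to_backend

edge_program = exir.capture(model, example_inputs).to_edge()
# Delegate supported subgraphs of the edge-dialect IR to a backend;
# MyPartitioner is a hypothetical Partitioner implementation.
delegated_program = to_backend(edge_program.exported_program, MyPartitioner)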

Need for the export API change.

The quantization workflow is built on top of exir.capture, which is built on top of the torch.export API. To support QAT, exported models need to work with eager-mode autograd. The current export in step 1 above emits ATen IR with core ATen ops. This IR is not autograd safe, meaning it is not safe to run such an exported model in eager mode (e.g. in Python) and expect the autograd engine to work. Thus training APIs, such as calculating loss on the output and calling backward on the loss, are not guaranteed to work with this IR.

It is important that the quantization APIs, for QAT as well as PTQ, work on the same IR, because a) it provides a better UX to users and b) it provides a single IR that backend-specific quantizers (read more here) can target.

For this reason we aligned on a two-stage export, rooted in the idea of progressive lowering. The two stages are:

  1. Export emits pre-dispatch ATen IR
  2. Pre-dispatch ATen IR is lowered to core ATen IR.

The output of stage 1 is autograd safe, and thus models exported at stage 1 can be trained via the eager-mode autograd engine, as the sketch below illustrates.
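
A rough sketch of a training step on the stage-1 output (capture_pre_autograd_graph as in the examples below; the data loader and loss are placeholders, and quantization calls are elided):

import torch

gm = torch._export.capture_pre_autograd_graph(model, example_inputs)

# The pre-dispatch ATen IR is autograd safe, so eager-mode training works:
optimizer = torch.optim.SGD(gm.parameters(), lr=1e-3)
for inputs, targets in train_loader:  # placeholder data loader
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(gm(inputs), targets)
    loss.backward()                   # the autograd engine works on this IR
    optimizer.step()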

New export API.

We are rolling out the changes related to the new export API in three stages.

Stage 1 (landed):

As shown in the figure below, exir.capture is broken down into:

(figure: exir.capture is split into capture_pre_autograd_graph followed by exir.capture, the latter to be replaced with torch.export)

Example of exporting model without quantization:

gm = torch._export.capture_pre_autograd_graph(m, example_inputs)
ep = exir.capture(gm, example_inputs) # to be replaced with torch.export

Example of exporting model with quantization:

gm = torch._export.capture_pre_autograd_graph(m, example_inputs)
quantized_gm = calls_to_quantization_api(gm)
ep = exir.capture(quantized_gm, example_inputs) # to be replaced with torch.export
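
For reference, calls_to_quantization_api above stands for the PT2E quantization entry points. A hedged sketch using the XNNPACK quantizer (module paths may differ across versions):

from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
prepared_gm = prepare_pt2e(gm, quantizer)  # insert observers/fake-quant
prepared_gm(*example_inputs)               # calibrate (PTQ); train instead for QAT
quantized_gm = convert_pt2e(prepared_gm)   # lower observers to q/dq ops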

You can see these changes here and here for how quantization APIs fit in.

Stage 2 (coming soon):

We will deprecate exir.capture in favor of directly using torch.export. More updates on this will be posted soon.

Stage 3 (timeline is to be determined):

The two APIs listed in stage 1 will be renamed: capture_pre_autograd_graph becomes torch.export, and exir.capture becomes to_core_aten.

torch.export will export a graph in ATen IR, with the full ATen opset, that is autograd safe, while to_core_aten will transform the output of torch.export into core ATen IR that is NOT autograd safe.

Example of exporting model without quantization:

ep = torch.export(model)
ep = ep.to_core_aten()

Example of exporting model with quantization:

ep = torch.export(model)

gm = ep.module() # obtain fx.GraphModule. API name may change
quantized_gm = calls_to_quantization_api(gm)
quantized_ep = torch.export(quantized_gm) # re-export for API compatibility

ep = quantized_ep.to_core_aten()

Timeline for this is to be determined, but this will NOT happen before PyTorch conference on 10/16/2023.

Why this change?

There are a couple of reasons. First, this change aligns with the long-term state, where capture_pre_autograd_graph is replaced with torch.export to obtain autograd-safe ATen IR, and the current use of exir.capture (or torch.export once it replaces it) is replaced with to_core_aten to obtain ATen IR with the core ATen opset.

Second, in the long term, export for quantization won't be a separate path: quantization will be an optional step, like delegation, in the export journey. Aligning with that now, in the short term, helps smooth the transition.

Why the change now?

To minimize migration pain later and to stay aligned with the long-term changes.

adonnini commented 7 months ago

Hi Kimish, should we use exir.capture or torch.export, now that capture is deprecated?

I am having trouble with exir.capture(...).to_edge().

When I run it with my model as the argument to capture, it fails with the traceback listed below. It looks like tracing the model fails even though the model runs to completion when executed outside of executorch.

I thought of adding constraints but I am not sure how to do that.

Please let me know if you need additional information

Thanks

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/tracer.py", line 667, in dynamo_trace
    return torchdynamo.export(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 1213, in inner
    result_traced = opt_f(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1528, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 401, in _fn
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1528, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/eval_frame.py", line 549, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 142, in _fn
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 384, in _convert_frame_assert
    return _compile(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 570, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 221, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 492, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
    transformations(instructions, code_options)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/convert_frame.py", line 462, in transform
    tracer.run()
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 2107, in run
    super().run()
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 747, in run
    and self.step()
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 710, in step
    getattr(self, inst.opname)(inst)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 405, in wrapper
    return inner_fn(self, inst)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 1143, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 582, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/variables/functions.py", line 307, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/variables/functions.py", line 261, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 618, in inline_user_function_return
    result = InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 2234, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 2358, in inline_call_
    tracer.run()
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 747, in run
    and self.step()
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 710, in step
    getattr(self, inst.opname)(inst)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 405, in wrapper
    return inner_fn(self, inst)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 1143, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/symbolic_convert.py", line 582, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/variables/nn_module.py", line 309, in call_function
    return wrap_fx_proxy(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/variables/builder.py", line 1304, in wrap_fx_proxy
    return wrap_fx_proxy_cls(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/variables/builder.py", line 1391, in wrap_fx_proxy_cls
    example_value = get_fake_value(proxy.node, tx)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 1422, in get_fake_value
    raise TorchRuntimeError(str(e)).with_traceback(e.__traceback__) from None
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 1383, in get_fake_value
    return wrap_fake_exception(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 952, in wrap_fake_exception
    return fn()
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 1384, in <lambda>
    lambda: run_node(tx.output, node, args, kwargs, nnmodule)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 1483, in run_node
    raise RuntimeError(fn_str + str(e)).with_traceback(e.__traceback__) from e
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_dynamo/utils.py", line 1467, in run_node
    return nnmodule(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1519, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1528, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/utils/_stats.py", line 20, in wrapper
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_subclasses/fake_tensor.py", line 1323, in __torch_dispatch__
    return self.dispatch(func, types, args, kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_subclasses/fake_tensor.py", line 1529, in dispatch
    return decomposition_table[func](*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_prims_common/wrappers.py", line 240, in _fn
    result = fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_decomp/decompositions.py", line 72, in inner
    r = f(*tree_map(increase_prec, args), **tree_map(increase_prec, kwargs))
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_decomp/decompositions.py", line 1306, in addmm
    out = alpha * torch.mm(mat1, mat2)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/utils/_stats.py", line 20, in wrapper
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_subclasses/fake_tensor.py", line 1323, in __torch_dispatch__
    return self.dispatch(func, types, args, kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_subclasses/fake_tensor.py", line 1621, in dispatch
    r = func(*args, **kwargs)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_ops.py", line 516, in __call__
    return self._op(*args, **kwargs or {})
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_meta_registrations.py", line 1891, in meta_mm
    torch._check(
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/__init__.py", line 1028, in _check
    _check_with(RuntimeError, cond, message)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/__init__.py", line 1011, in _check_with
    raise error_type(message_evaluated)
torch._dynamo.exc.TorchRuntimeError: Failed running call_module L__self___encoder_embedding_linear_embd(*(FakeTensor(..., size=(0,), dtype=torch.float64),), **{}):
a and b must have same reduction dim, but got [1, 0] X [2, 512].

from user code:
   File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py", line 473, in forward
    enc_embed = self.encoder_embedding.forward(enc_input)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py", line 384, in forward
    x = self.linear_embd(x) * math.sqrt(self.emb_size)     # Shape = (B, N, C)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train.py", line 297, in <module>
    print(exir.capture(m, (VAL_INPUT, DEC_INPUT, DEC_SOURCE_MASK, DEC_TARGET_MASK)).to_edge())
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/capture/_capture.py", line 146, in capture
    graph_module, _ = dynamo_trace(
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/tracer.py", line 686, in dynamo_trace
    raise InternalError(
executorch.exir.error.InternalError: torchdynamo internal error occured. Please see above stacktrace
adonnini commented 7 months ago

Hi, I produced an exported model with torch.export. If I save it using torch.export.save, can I use the model file for inference in my Android application?

Put another way, what is the equivalent of open("tfmodel.pte", "wb").write(exir.capture(...).to_edge().to_executorch().buffer) when using torch.export?

Thanks

kimishpatel commented 7 months ago

@adonnini

Let me know if this answers your question.
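
In case the link gets lost, a rough sketch of the torch.export-based equivalent (assuming exir.to_edge accepts the ExportedProgram from torch.export and that to_executorch() still exposes .buffer):

import torch
import executorch.exir as exir

ep = torch.export.export(model, example_inputs)
executorch_program = exir.to_edge(ep).to_executorch()

with open("tfmodel.pte", "wb") as f:
    f.write(executorch_program.buffer)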

adonnini commented 7 months ago

Hi @kimishpatel, thanks for the response. I saved the model exported with torch.export without any problems, and I did read the examples and related tutorials (several times).

Unfortunately, exir.capture does not work for me, as you can see from the traceback I posted in my message yesterday (please see above).

I also tried to load a saved model (and its dictionary) and then use it as one of the arguments for exir.capture. It failed with the traceback below.

torch.export works with a saved model.

I cannot see from the traceback above what causes the exir.capture failure. What do you think? What should I do next?

TRACEBACK PRODUCED WHEN USING SAVED MODEL AS ARGUMENT IN exir.capture

  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train.py", line 334, in <module>
    print(exir.capture(model_loaded, (VAL_INPUT, DEC_INPUT, DEC_SOURCE_MASK, DEC_TARGET_MASK)).to_edge())
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/program/_program.py", line 168, in to_edge
    return _to_edge(self, config)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/program/_program.py", line 283, in _to_edge
    EXIRATenDialectVerifier()(ep.exported_program.graph_module)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_export/verifier.py", line 58, in __call__
    self.check_valid(gm)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_export/verifier.py", line 117, in check_valid
    self.check_valid_op(node.target)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/_export/verifier.py", line 166, in check_valid_op
    raise SpecViolationError(
torch._export.verifier.SpecViolationError: Operator torch._ops.aten.detach.default is not Aten Canonical.
kimishpatel commented 7 months ago

Oh, it seems aten.detach is not considered a core op. For now, set this to False: https://github.com/pytorch/executorch/blob/main/exir/capture/_config.py#L34, and try again. Note that you can also pass it via config to to_edge, roughly as sketched below. cc: @guangy10
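
A sketch of that config route (the flag name is an assumption based on the linked _config.py and may differ in your version):

import executorch.exir as exir

# Relax the edge-dialect verifier so non-core ops such as
# aten.detach.default do not raise SpecViolationError:
edge_config = exir.EdgeCompileConfig(_check_ir_validity=False)
edge_program = exir.capture(model, example_inputs).to_edge(edge_config)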

adonnini commented 7 months ago

I made the change you suggested. Now the code fails with the following error:

Traceback (most recent call last):
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/fx/passes/infra/pass_manager.py", line 270, in __call__
    res = fn(module)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/fx/passes/infra/pass_base.py", line 41, in __call__
    self.ensures(graph_module)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/passes/__init__.py", line 311, in ensures
    raise RuntimeError(f"Missing out variants: {self.missing_out_vars}")
RuntimeError: Missing out variants: {'aten::alias'}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train.py", line 340, in <module>
    open("tfmodel.pte", "wb").write(exir.capture(model_loaded, (VAL_INPUT, DEC_INPUT, DEC_SOURCE_MASK, DEC_TARGET_MASK)).to_edge().to_executorch().buffer)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/program/_program.py", line 181, in to_executorch
    new_prog = ep._transform(*edge_to_executorch_passes(config))
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/export/exported_program.py", line 569, in _transform
    res = pm(self.graph_module)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/fx/passes/infra/pass_manager.py", line 296, in __call__
    raise Exception(msg) from e
Exception: An error occurred when running the 'ToOutVarPass' pass after the following passes: ['SpecPropPass', 'EdgeToBackendOpsPass', 'RemoveAssertAsyncPass', 'HintBasedSymShapeEvalPass']
kimishpatel commented 7 months ago

Strange. @larryliu0820 can you take a look? alias doesn't have an out variant, but alias_copy does in native_functions.yaml. Not sure why functionalization is not generating alias_copy. Maybe @bdhirsh knows.

larryliu0820 commented 7 months ago

detach should be removed from the graph; not sure why it sticks around. For alias, my understanding is that functionalization may not replace it with alias_copy if we never mutate the value of the alias result. @bdhirsh answered this: https://discuss.pytorch.org/t/aten-ir-and-mutation-in-place/172129/2

So I think alias can be removed because it's a no-op?
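
If so, a no-op removal pass could look roughly like this (an untested sketch over an fx.GraphModule, not an existing exir pass):

import torch

def remove_alias_nodes(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
    # aten.alias is a no-op view here, so rewire each alias node's
    # users to the node's input and erase the node.
    for node in list(gm.graph.nodes):
        if node.op == "call_function" and node.target == torch.ops.aten.alias.default:
            node.replace_all_uses_with(node.args[0])
            gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
    return gm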

adonnini commented 7 months ago

Did you see

https://github.com/pytorch/executorch/issues/1132#issuecomment-1791108182

adonnini commented 4 months ago

@kimishpatel, sorry to bother you. I was wondering when you think there will be an update on https://github.com/pytorch/executorch/issues/1350. At this point I am stuck: my app can load the model using the executorch runtime engine, but it cannot proceed further because of that issue. Thanks

kimishpatel commented 4 months ago

> @kimishpatel, sorry to bother you. I was wondering when you think there will be an update on the issue pytorch/executorch#1350. At this point I am stuck: my app can load the model using the executorch runtime engine, but it cannot proceed further because of pytorch/executorch#1350. Thanks

Apologies for the late response, I was on PTO. Let me follow up on the issue.

adonnini commented 4 months ago

@kimishpatel sorry for coming back to you. I have received no response on the two issues that are blocking progress on my work:

https://github.com/pytorch/pytorch/issues/120219 and https://github.com/pytorch/executorch/issues/1350

I realize that we may still be in the leave period. If that is the case, please let me know when I should touch base again.

Thanks for your patience and your help

kimishpatel commented 4 months ago

@adonnini don't apologize. You have been very patient. Let me follow up and see what's happening.

adonnini commented 3 months ago

Hi @kimishpatel, I hope you are well. I opened two issues, https://github.com/pytorch/executorch/issues/2204 and https://github.com/pytorch/executorch/issues/2163, both at the beginning of last week. To date I have not received any feedback. I know your team is dealing with many issues (I can see the list of open issues getting longer). Would it be possible to let me know when someone will take a look at these two? I am getting closer to being able to run my models for inference from my Android application. I wish I could resolve these two problems, but I can't without your help.

adonnini commented 3 months ago

@kimishpatel I don't know if I did something wrong, but issue https://github.com/pytorch/pytorch/issues/120219 has resurfaced even though I used the strict=False workaround. Please note that for executorch I used the main branch, not the release https://github.com/pytorch/executorch/releases/tag/v0.1.0

kimishpatel commented 3 months ago

Yeah, I don't know the compatibility with v0.1.0.

adonnini commented 3 months ago

@kimishpatel Hi, in the next few weeks we will start test deployments of the Android application, and I would love to have the run-for-inference function using executorch working by then. I am always hesitant to contact you, knowing how much you have on your plate, but I have not received any follow-up on these issues: https://github.com/pytorch/executorch/issues/2204 https://github.com/pytorch/executorch/issues/2163 https://github.com/pytorch/pytorch/issues/120219 It's been a few weeks for all of them. Please let me know if there is something I should be doing to help resolve them. Thanks

adonnini commented 1 month ago

@kimishpatel I hope you are well. In a couple of weeks we will start deployment of my Android application, and I would love to be able to include run-for-inference of the two models I am using to predict user location. Two issues, https://github.com/pytorch/executorch/issues/2163 and https://github.com/pytorch/pytorch/issues/120219, are blocking my progress, and it's been a few weeks since I have heard anything about their resolution. @jansel tried to help, suggesting that https://github.com/pytorch/pytorch/pull/123318 might also resolve https://github.com/pytorch/pytorch/issues/120219; unfortunately, it did not. I realize that it's a matter of priorities and that the team is focusing on the upcoming executorch release. Please let me know if there is anything I can do. Any update would be greatly appreciated. Thanks for your patience as you keep receiving my messages.

kimishpatel commented 1 month ago

@adonnini thanks for bringing this back. Let me raise it internally and see what traction we get. I truly appreciate how you have been trying to make this work.

adonnini commented 1 month ago

@kimishpatel A quick update: @angelayi has been very helpful in trying to solve https://github.com/pytorch/pytorch/issues/120219, and we are making progress. I have not heard back regarding https://github.com/pytorch/executorch/issues/2163; I am waiting for a response from @lucylq (@kirklandsign asked her to take a look). Thanks

kimishpatel commented 1 month ago

Ok, let me ping them again.

adonnini commented 1 month ago

@kimishpatel Thanks for your help. I really appreciate it! I think https://github.com/pytorch/executorch/issues/2163 is pretty close to being resolved; @kirklandsign was very helpful. Resolving it brought https://github.com/pytorch/executorch/issues/1350 up again, which I have been waiting to hear about since February. Thanks

kimishpatel commented 1 month ago

@adonnini no problem and thank you for your patience. I really appreciate it. I think @kirklandsign should be able to help you resolve it. If not, please bring it to my attention again. Thanks

adonnini commented 1 month ago

Thanks! With regards to https://github.com/pytorch/executorch/issues/1350 I was communicating with @mcr229 who told me that the fix to the issue would be available by around the end of February.

kimishpatel commented 1 month ago

@adonnini do you know if the error happens only with the Android app, or have you also tried running the model via a standalone binary, like the executor runner? https://github.com/pytorch/executorch/tree/main/examples/portable/executor_runner

adonnini commented 1 month ago

@kimishpatel I am not familiar with the executor runner. When I ran the model for inference outside of executorch, it worked as expected. Is this what you were asking? By the way, I ran the model for inference via my Android app, lowering it using PyTorch Mobile (TorchScript). Thanks