tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0

TensorFlow Lite with iOS MTLBuffer doesn't support dynamic shape? #71740

Open zhanghuicuc opened 1 month ago

zhanghuicuc commented 1 month ago

Issue type

Feature Request

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

tf 2.16.1

Custom code

No

OS platform and distribution

iOS

Mobile device

iPhone

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I'm trying to use TFLite with a Metal MTLBuffer on iOS, following the docs: https://www.tensorflow.org/lite/ios/delegates/gpu#inputoutput_buffers_using_c_api

My model.tflite is designed for a dynamic-shape use case, so the input/output shape is saved as [1,-1,-1,1] in the model.

I've tried calling ResizeInputTensor before ModifyGraphWithDelegate, but Invoke then fails with "Execution of the command buffer was aborted due to an error during execution. Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault)".

If I don't call ResizeInputTensor before ModifyGraphWithDelegate, I still get nothing from the output tensor.

I'd like to know whether the TensorFlow Lite Metal delegate supports dynamic shapes and, if so, how to use it with them correctly.
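A hedged aside, not confirmed anywhere in this thread: one thing that may be worth ruling out with a page fault like this is a bound buffer that is smaller than the tensor it backs. The sketch below computes how many bytes a dense float32 tensor needs once a shape such as [1, input_height, input_width, 1] has been resolved, and returns no size while any dimension is still -1. The helper names (HasDynamicDim, Float32ByteSize, BufferFits, SelfCheck) are hypothetical and not part of TensorFlow Lite, and the dense float32 layout is an assumption; the delegate may use a different layout internally.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: true if any dimension is still dynamic (stored as -1).
bool HasDynamicDim(const std::vector<int>& shape) {
    for (int d : shape) {
        if (d < 0) return true;
    }
    return false;
}

// Hypothetical helper: bytes needed by a dense float32 tensor of this shape,
// or std::nullopt while the shape is not fully resolved.
// Assumption: tightly packed float32 data with no padding.
std::optional<std::size_t> Float32ByteSize(const std::vector<int>& shape) {
    if (HasDynamicDim(shape)) return std::nullopt;
    std::size_t elements = 1;
    for (int d : shape) elements *= static_cast<std::size_t>(d);
    return elements * sizeof(float);
}

// Hypothetical helper: true if a buffer of buffer_bytes can hold the tensor.
bool BufferFits(const std::vector<int>& shape, std::size_t buffer_bytes) {
    const std::optional<std::size_t> needed = Float32ByteSize(shape);
    return needed.has_value() && buffer_bytes >= *needed;
}

// Small self-check using a 480x640 single-channel input as an example size.
void SelfCheck() {
    assert(HasDynamicDim({1, -1, -1, 1}));
    assert(!Float32ByteSize({1, -1, -1, 1}).has_value());
    assert(BufferFits({1, 480, 640, 1}, 480u * 640u * sizeof(float)));
}
```

In the repro below, the second argument to BufferFits would be the MTLBuffer's length (for example `[input length]`), checked against the shape passed to ResizeInputTensor before binding the buffer.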

Standalone code to reproduce the issue

        tflite::ops::builtin::BuiltinOpResolver op_resolver;
        tflite::InterpreterBuilder interpreter_builder(model, op_resolver);

        // Configure and create the delegate.
        TFLGpuDelegateOptions options;
        options.enable_quantization = true;
        options.allow_precision_loss = true;
        options.wait_type = TFLGpuDelegateWaitType::TFLGpuDelegateWaitTypeActive;
        _gpu_delegate = TFLGpuDelegateCreate(&options);

        if (interpreter_builder(&_predictor) != kTfLiteOk || !_predictor) {
            GLOGE("Unable to prepare TfLite interpreter.");
        }
        TfLiteStatus status;
        status = _predictor->ResizeInputTensor(0, {1, input_height, input_width, 1});
        if (status != kTfLiteOk) {
            GLOGE("Failed to resize input tensor: {}", status);
            return false;
        }

        status = _predictor->ModifyGraphWithDelegate(_gpu_delegate);
        if (status != kTfLiteOk) {
            GLOGE("Failed to ModifyGraphWithDelegate: {}", status);
            return false;
        }

        _predictor->SetAllowBufferHandleOutput(true);  // disable default gpu->cpu copy

        // id<MTLBuffer> input and id<MTLBuffer> output come from other parts of my code.
        if (!TFLGpuDelegateBindMetalBufferToTensor(
            _gpu_delegate, _predictor->inputs()[0], input)) {
            GLOGE("Failed to TFLGpuDelegateBindMetalBufferToTensor input");
            return false;
        }
        if (!TFLGpuDelegateBindMetalBufferToTensor(
                _gpu_delegate, _predictor->outputs()[0], output)) {
            GLOGE("Failed to TFLGpuDelegateBindMetalBufferToTensor output");
            return false;
        }

        id<MTLCommandBuffer> command_buffer = [_metal_queue commandBuffer];
        command_buffer.label = @"TfliteMetalRunner";
        TFLGpuDelegateSetCommandBuffer(_gpu_delegate, command_buffer);

        if (_predictor->Invoke() != kTfLiteOk) {
            GLOGE("metal runner invoke failed");
            return false;
        }
        GLOGE("metal runner invoke success");

        [command_buffer commit];
        [command_buffer waitUntilScheduled];

Relevant log output

2024-07-12 18:54:38.834941+0800 myapp[6052:2224696] Execution of the command buffer was aborted due to an error during execution. Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault)
2024-07-12 18:54:38.890017+0800 myapp[6052:2214145] Execution of the command buffer was aborted due to an error during execution. Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault)
2024-07-12 18:54:38.920375+0800 myapp[6052:2214145] Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
2024-07-12 18:54:38.920508+0800 myapp[6052:2214145] Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)

pkgoogle commented 1 month ago

Hi @zhanghuicuc, we definitely expect dynamic shapes to work with the C++/Python interfaces: https://www.tensorflow.org/lite/guide/inference#run_inference_with_dynamic_shape_model ... I don't see an example for MTLBuffer ... so let's assume it's supported for now. Do you have your .tflite model/file available so that I may test/reproduce this? I'm assuming you are using Xcode; if you can share as much of your project as possible, that will help us look into this. Thanks.

zhanghuicuc commented 1 month ago

@pkgoogle could you tell me your e-mail so I can send you the model and code?

pkgoogle commented 1 month ago

Hi @zhanghuicuc, is there any way you can use drive.google.com? It is easier for us to share via that (you have to give me or others permission or something like that). For my own safety, I tend not to give out my email.

zhanghuicuc commented 1 month ago

@pkgoogle tflite model link: https://drive.google.com/file/d/1ZsmF1kiXlzOYNHaVR7a4h4kXPZ5_Ercc/view?usp=sharing

zhanghuicuc commented 1 month ago

@pkgoogle Xcode demo project link: https://drive.google.com/file/d/1v_osYI36D_frhMYBS1VEkjieBon0s0RT/view?usp=sharing The relevant code is in tflite_metal_runner.mm.

pkgoogle commented 1 month ago

@zhanghuicuc, I have requested access; please grant it when appropriate.

zhanghuicuc commented 1 month ago

@pkgoogle granted

pkgoogle commented 1 month ago

I'm running into simulator issues reproducing your issue. @yishuangP, can you please take a look? Thanks.

zhanghuicuc commented 1 month ago

@yishuangP is there any update, or any information you'd like me to provide?

zhanghuicuc commented 2 weeks ago

@pkgoogle hi, any update here?