google-ai-edge / LiteRT

LiteRT is the new name for TensorFlow Lite (TFLite). While the name is new, it's still the same trusted, high-performance runtime for on-device AI, now with an expanded vision.
https://ai.google.dev/edge/litert
Apache License 2.0

GPUv2 segfaults on split-head attention CLIP model #65

Open gaikwadrahul8 opened 3 days ago

gaikwadrahul8 commented 3 days ago

System information

Standalone code to reproduce the issue

Model asset: tflite_66721_sha_clip_gpuv2_segfault.tflite

Run the model through TFLite with the GPUv2 delegate on an Android device (for instance, through the benchmark tool).
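
One way to script this reproduction step is to drive the TFLite `benchmark_model` binary over `adb`. The sketch below is an assumption about the setup, not the reporter's exact procedure: the on-device paths under `/data/local/tmp` are hypothetical, and it assumes the binary and the model asset have already been pushed there. It uses the benchmark tool's `--graph`, `--use_gpu`, and `--gpu_backend` flags to request the GPU delegate with the OpenCL backend reported in the runtime log.

```python
import shlex
import subprocess

# Hypothetical on-device locations; adjust to wherever the files were pushed.
BENCHMARK_BIN = "/data/local/tmp/benchmark_model"
MODEL_PATH = "/data/local/tmp/tflite_66721_sha_clip_gpuv2_segfault.tflite"


def build_benchmark_command(model_path=MODEL_PATH, benchmark_bin=BENCHMARK_BIN):
    """Build the adb invocation that runs the model on the GPU delegate."""
    device_cmd = [
        benchmark_bin,
        f"--graph={model_path}",
        "--use_gpu=true",      # request the GPU delegate
        "--gpu_backend=cl",    # force the OpenCL backend
    ]
    # adb passes a single string to the device shell, so quote each part.
    return ["adb", "shell", " ".join(shlex.quote(part) for part in device_cmd)]


def run_benchmark(cmd):
    """Run the command; a crash shows up as a non-zero return code."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Only print the command here; call run_benchmark() with a device attached.
    print(" ".join(build_benchmark_command()))
```

With a device attached, `run_benchmark(build_benchmark_command())` returns a result whose `returncode` and `stderr` can be checked for the crash.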

Any other info / logs

Runtime log (executed on https://aihub.qualcomm.com/)

[30/Apr/2024:10:26:55 -07:00: profiler/info] -=- Tungsten Initializing -=-
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.board.platform = gs201
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.boot.hardware = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.boot.hardware.platform = gs201
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.system.build.id = TQ1A.221205.011
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.system.build.version.release = 13
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.hardware = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.hardware.chipname = 
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.board = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.brand = google
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.device = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.build.fingerprint = google/panther/panther:13/TQ1A.221205.011/9244662:user/release-keys
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.manufacturer = Google
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.model = Pixel 7
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.product.name = panther
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.soc.manufacturer = Google
[30/Apr/2024:10:26:55 -07:00: profiler/info] Android system property: ro.soc.model = GS201
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] DeviceManager::DeviceManager
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] findAvailableDevices
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] Found interface google-edgetpu (version = 2.0)
[30/Apr/2024:10:26:55 -07:00: profiler/info] [Manager] Found interface google-armnn (version = ArmNN)
[30/Apr/2024:10:26:55 -07:00: profiler/info] NNAPI devices: google-edgetpu,google-armnn,nnapi-reference
[30/Apr/2024:10:26:55 -07:00: profiler/info] GPU device: ARM Mali-G710
[30/Apr/2024:10:26:55 -07:00: profiler/info] OpenGL Version: OpenGL ES 3.2 v1.r36p0-01eac0.1f36dec337e44918d811de9a8a2acf4d
[30/Apr/2024:10:26:55 -07:00: profiler/info] OpenCL Version: OpenCL C 1.2 v1.r36p0-01eac0.1f36dec337e44918d811de9a8a2acf4d
[30/Apr/2024:10:26:55 -07:00: profiler/info] -=- Tungsten Running Task: Loading -=-
[30/Apr/2024:10:26:55 -07:00: profiler/info] Detected chipset 3101, made by 3000.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size before: 24632.0 kB, allocated: 13796.0 kB, slack: 10836.0 kB.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Current memory baseline range: 57552.0-68388.0 kB.
[30/Apr/2024:10:26:55 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:10:26:55 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loaded model. Minimum TF Lite version = 2.3.0.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] No delegates specified; using compute unit=cpu_and_gpu.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Initialized TensorFlow Lite runtime.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:10:26:55 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created TensorFlow Lite delegate for GPU.
[30/Apr/2024:10:26:55 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Replacing 2003 out of 2003 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:10:26:56 -07:00: profiler/warning] [job_id: jygz19nxp] [model.tflite] [tflite] File /data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2/gpuv2_1297717803319390986.bin couldn't be opened for reading: No such file or directory
[30/Apr/2024:10:27:00 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Initialized OpenCL-based API.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created 1 GPU delegate kernels.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Applied 1 delegates: GPUV2/OpenCL. Model is fully delegated=true.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Saving delegate selection for subsequent steps.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size after: 256412.0 kB, allocated: 233690.0 kB, slack: 22722.0 kB.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Status Successfully Loaded Cold with t = 5381726 us and usage: before = 68388.0 kB; peakBefore = 68388.0 kB; mallocUnusedBefore = 10836.0 kB; after = 291732.0 kB; peakAfter = 805160.0 kB; mallocUnusedAfter = 22722.0 kB; increase = 200622.0-211458.0 kB; peak = 736772.0-747608.0 kB
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Saving results to /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:01 -07:00: profiler/info] -=- Tungsten Running Task: Loading -=-
[30/Apr/2024:10:27:01 -07:00: profiler/info] Detected chipset 3101, made by 3000.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading previously saved results in /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size before: 77880.0 kB, allocated: 16704.4 kB, slack: 61175.6 kB.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Current memory baseline range: 25732.4-86908.0 kB.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loaded model. Minimum TF Lite version = 2.3.0.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:10:27:01 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Enabling delegate cache in dir=/data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2.
[30/Apr/2024:10:27:01 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Replacing 2003 out of 2003 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Found serialized data for model gpuv2 (175507208 B) at /data/user/0/ai.tetra.tungsten/cache/1714498015468/ai.tetra.runtime/0.6.0/model.tflite_8945450969824422876_1714498006500/gpuv2/gpuv2_1297717803319390986.bin
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Initialized OpenCL-based API from serialized data.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created 1 GPU delegate kernels.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Applied 1 delegates: GPUV2/OpenCL. Model is fully delegated=true.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size after: 252240.0 kB, allocated: 225091.0 kB, slack: 27149.0 kB.
[30/Apr/2024:10:27:02 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Status Successfully Loaded Warm with t = 1281645 us and usage: before = 86908.0 kB; peakBefore = 86908.0 kB; mallocUnusedBefore = 61175.6 kB; after = 283312.0 kB; peakAfter = 785988.0 kB; mallocUnusedAfter = 27149.0 kB; increase = 169255.0-230430.6 kB; peak = 699080.0-760255.6 kB
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Saving results to /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:03 -07:00: profiler/info] -=- Tungsten Running Task: Performing inference by layer -=-
[30/Apr/2024:10:27:03 -07:00: profiler/info] Detected chipset 3101, made by 3000.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading previously saved results in /storage/emulated/0/Android/data/ai.tetra.tungsten/files/Results/job_jygz19nxp/job_jygz19nxp_results.bin
[30/Apr/2024:10:27:03 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Starting profiler
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loading tflite model Models/model.tflite
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size before: 77880.0 kB, allocated: 16961.8 kB, slack: 60918.2 kB.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Current memory baseline range: 45341.8-106260.0 kB.
[30/Apr/2024:10:27:03 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Runtime metadata not found in Models/model.tflite/trt_metadata.json or Models/model.tflite/trt_metadata.pb
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] TF Lite version 2.16.1. Loading model from Models/model.tflite.
[30/Apr/2024:10:27:03 -07:00: profiler/debug] [job_id: jygz19nxp] [model.tflite] Mapping resource file in Models/model.tflite
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Loaded model. Minimum TF Lite version = 2.3.0.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] GPUV2 delegate requested. OpenCL detected.
[30/Apr/2024:10:27:03 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Replacing 2003 out of 2003 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] [tflite] Created 1 GPU delegate kernels.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Applied 1 delegates: GPUV2/OpenCL. Model is fully delegated=true.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Malloc VM size after: 259724.0 kB, allocated: 243057.6 kB, slack: 16666.4 kB.
[30/Apr/2024:10:27:07 -07:00: profiler/info] [job_id: jygz19nxp] [model.tflite] Status Successfully Loaded Warm with t = 3769509 us and usage: before = 106260.0 kB; peakBefore = 106260.0 kB; mallocUnusedBefore = 60918.2 kB; after = 300360.0 kB; peakAfter = 635204.0 kB; mallocUnusedAfter = 16666.4 kB; increase = 177433.6-238351.8 kB; peak = 528944.0-589862.2 kB
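
The log above records three successful loads before the crash. As a rough aid for reading it, the following sketch pulls out the load kind and duration from each `Status Successfully Loaded` line. The regular expression is an assumption based only on the line format shown in this log, and the `EXCERPT` string contains fragments of the three status lines, not full log lines.

```python
import re

# Matches the profiler's load summary, e.g.
# "Status Successfully Loaded Cold with t = 5381726 us"
STATUS_RE = re.compile(r"Status Successfully Loaded (Cold|Warm) with t = (\d+) us")


def load_times_us(log_text):
    """Return [(kind, microseconds), ...] in the order the loads appear."""
    return [(kind, int(us)) for kind, us in STATUS_RE.findall(log_text)]


# Fragments of the three status lines from the log above.
EXCERPT = """
Status Successfully Loaded Cold with t = 5381726 us
Status Successfully Loaded Warm with t = 1281645 us
Status Successfully Loaded Warm with t = 3769509 us
"""

if __name__ == "__main__":
    for kind, us in load_times_us(EXCERPT):
        print(f"{kind:4s} load: {us / 1e6:.3f} s")
```

On this log that gives 5.382 s for the cold load, 1.282 s for the warm load that found serialized data, and 3.770 s for the load in the per-layer inference task, for which the log shows no serialized-data line. All three loads succeed, so the segmentation fault occurs after delegate initialization.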

The process ended because of a segmentation fault. Consult the runtime log for more details.
The following is the suspected stack trace.
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * clEnqueueNDRangeKernel (/vendor/lib64/egl/libGLES_mali.so)
 * tflite::gpu::cl::CLCommandQueue::Dispatch() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::gpu::cl::ProfilingCommandQueue::DispatchNTimes() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::gpu::cl::InferenceContext::ProfileTime() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::gpu::cl::InferenceContext::Profile() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::Subgraph::InvokeImpl() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::Subgraph::Invoke() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tflite::impl::Interpreter::Invoke() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * backend::tflite::TfLiteModel::Run() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tungsten::Profiler::ProfileOrValidate() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tungsten::ProfilerRunner::ProfileModels() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * tungsten::ProfilerRunner::RunTask() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * Java_ai_tetra_tungsten_ProfilerRunner_profileModels (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
 * ? (/apex/com.android.art/lib64/libart.so)
 * ? (/apex/com.android.art/lib64/libart.so)
 * ? (/apex/com.android.art/lib64/libart.so)
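
Most frames in the trace above are unsymbolized (`?`). To make the call path easier to read, the sketch below keeps only the named frames and shortens each library path to its file name. The frame pattern is an assumption inferred from the ` * symbol (path)` layout of this trace, and `EXCERPT` holds three frames copied from it.

```python
import posixpath
import re

# One frame per line: " * <symbol> (<absolute library path>)"
FRAME_RE = re.compile(r"^\s*\*\s+(.+?)\s+\((/[^)]*)\)\s*$")


def symbolized_frames(trace_text):
    """Return [(symbol, library file name), ...], skipping '?' frames."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group(1) != "?":
            frames.append((match.group(1), posixpath.basename(match.group(2))))
    return frames


# Three frames copied from the trace above (innermost first).
EXCERPT = """
 * ? (/vendor/lib64/egl/libGLES_mali.so)
 * clEnqueueNDRangeKernel (/vendor/lib64/egl/libGLES_mali.so)
 * tflite::gpu::cl::CLCommandQueue::Dispatch() (/data/app/~~58txPSlH_K9lY1T48CQEEw==/ai.tetra.tungsten-qGxzJmjd0hwWY5NXpr8n5A==/lib/arm64/libtungsten-native-bridge.so)
"""

if __name__ == "__main__":
    for symbol, library in symbolized_frames(EXCERPT):
        print(f"{library}: {symbol}")
```

Applied to the full trace, the first named frame is `clEnqueueNDRangeKernel` in `libGLES_mali.so`, reached from `tflite::gpu::cl::CLCommandQueue::Dispatch()` through the delegate's profiling path (`ProfilingCommandQueue::DispatchNTimes()`, `InferenceContext::ProfileTime()`, `InferenceContext::Profile()`).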

gaikwadrahul8 commented 2 days ago

This issue, originally reported by @gustavla, has been moved to this dedicated LiteRT repository to improve issue tracking and prioritization. To ensure continuity, we have created this new issue on your behalf.

We appreciate your understanding and look forward to your continued involvement.