google / agi

Android GPU Inspector
https://gpuinspector.dev
Apache License 2.0

Failed to validate the trace of sample application with the Adreno validator #1337

Closed: wzq1313741 closed this issue 8 months ago

wzq1313741 commented 9 months ago

Environment information:

Bug description

Reproduction steps

Click Reload or Retry.

Stacktrace

FailedTraceValidation Error: Passed 2 checks out of 9, expected to pass 9 check(s). Failed check for 7 counter(s): [{1 Clocks / Second 0x7ff790ac1f60} {3 GPU %!U(MISSING)tilization 0x7ff790ac1f60} {21 %!S(MISSING)haders Busy 0x7ff790ac1f60} {26 Fragment ALU Instructions / Sec (Full) 0x7ff790ac1f60} {31 Textures / Fragment 0x7ff790ac1f60} {37 %!T(MISSING)ime Shading Fragments 0x7ff790ac1f60} {38 %!T(MISSING)ime Shading Vertices 0x7ff790ac1f60}]
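The `%!U(MISSING)` fragments in the message above are what Go's `fmt` package prints when a format string contains a verb with no matching argument, so the `%` in each counter name appears to have been consumed as a verb. The following is a minimal sketch for reading the failed counters back out of such a message. It assumes the original names had `% ` in front of the swallowed letter (for example `GPU % Utilization`), which is a guess from the text rather than something the log confirms. The helper name `failed_counters` is hypothetical and not part of AGI.

```python
import re

# Go's fmt emits "%!X(MISSING)" for a verb X that has no matching argument.
MISSING_VERB = re.compile(r"%!(\w)\(MISSING\)")
# Each failed counter is logged as "{<id> <name> <pointer>}".
COUNTER = re.compile(r"\{(\d+) (.+?) 0x[0-9a-f]+\}")


def failed_counters(message):
    """Return (id, name) pairs for the counters listed in a validation error."""
    # Assumption: the original counter names read "% <Letter>...".
    restored = MISSING_VERB.sub(r"% \1", message)
    return [(int(cid), name) for cid, name in COUNTER.findall(restored)]


message = (
    "Failed check for 7 counter(s): [{1 Clocks / Second 0x7ff790ac1f60} "
    "{3 GPU %!U(MISSING)tilization 0x7ff790ac1f60} "
    "{21 %!S(MISSING)haders Busy 0x7ff790ac1f60} "
    "{26 Fragment ALU Instructions / Sec (Full) 0x7ff790ac1f60} "
    "{31 Textures / Fragment 0x7ff790ac1f60} "
    "{37 %!T(MISSING)ime Shading Fragments 0x7ff790ac1f60} "
    "{38 %!T(MISSING)ime Shading Vertices 0x7ff790ac1f60}]"
)

for counter_id, name in failed_counters(message):
    print(counter_id, name)
```

Listing the counters this way makes it easier to quote the exact failing counter IDs and names when reporting the problem to the device manufacturer.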

Screenshots 20240110-114029

Additional debugging information

gapic.log

I20240110-110546026[pool-1-thread-1][server.GapiPaths.checkForTools] Looking for GAPID in C:\Program Files (x86)\agi -> C:\Program Files (x86)\agi\gapis.exe
I20240110-110546034[server.ChildProcess-gapis][server.ChildProcess.runProcess] Starting gapis as [C:\Program Files (x86)\agi\gapis.exe, -enable-local-files, -crashreport, -analytics, b63d4812-6759-4395-b0cd-254a965c462a, -log-file, C:\Users\Administrator\AppData\Local\Temp\gapis.log, -log-level, Info, -gapir-args, --log C:\Users\Administrator\AppData\Local\Temp\gapir.log --log-level I, --strings, C:\Program Files (x86)\agi\strings, --gapis-auth-token, bPrWJfuZ, --idle-timeout, 60000ms, --adb, C:\Program Files\RenderDoc\plugins\android\adb.exe]
I20240110-110546095[server.ChildProcess$LoggingStringHandler][server.ChildProcess$LoggingStringHandler.lambda$new$0] gapis: 11:05:46.085 I: Logging to: C:\Users\Administrator\AppData\Local\Temp\gapis.log
I20240110-110546773[server.ChildProcess$LoggingStringHandler][server.ChildProcess$LoggingStringHandler.lambda$new$0] gapis:
I20240110-110546774[server.ChildProcess$LoggingStringHandler][server.ChildProcess$LoggingStringHandler.lambda$new$0] gapis: Bound on port '2535'
I20240110-110546774[server.ChildProcess$LoggingStringHandler][server.GapisProcess.lambda$createStdoutHandler$2] Detected gapis startup on port 2535
I20240110-110546775[pool-1-thread-2][server.GapisProcess.lambda$new$1] Established a new client connection to 2535
I20240110-110546878[pool-1-thread-1][Server.fetchServerInfo] Server info: name: "WZQ" version_major: 3 version_minor: 3 version_point: 1 server_local_device { ID { data: f69717b1e06d17defaab2f3b16e24a2db368155e } }

gapis.log

11:43:25.250 I: [ValidateDevice] Validating device (Adreno (TM) 660) with (ADRENO) validator
11:43:26.079 I: [start⇒ValidateDevice] No developer driver found, attempting to use GPU profiling libraries in system image.
11:43:26.784 I: [start⇒ValidateDevice] Unlocking device screen
11:43:27.938 I: [start⇒ValidateDevice] [launch producer] I: Trying libgpudataproducer.so
11:43:27.938 I: [start⇒ValidateDevice] [launch producer] I: Calling start at 0x7265251f48
11:43:27.963 I: [start⇒ValidateDevice] [launch producer] [725.370] tersDataProducer.cpp:37 QProfilerInterface: Initializing (v2020.11, built Sep 6 2022@05:22:23)...
11:43:27.965 I: [start⇒ValidateDevice] [launch producer] [725.371] ersDataProducer.cpp:255 gpu.counters: Starting tracing thread
11:43:27.966 I: [start⇒ValidateDevice] [launch producer] [725.373] perfetto.cc:37699 Producer connected
11:43:28.065 I: [start⇒ValidateDevice] [launch producer] [725.470] ersDataProducer.cpp:331 Start counter trace returned 0
11:43:31.089 I: [start⇒ValidateDevice] [launch producer] [728.494] perfetto.cc:38150 Setting up data source 22 gpu.counters
11:43:31.089 I: [start⇒ValidateDevice] [launch producer] [728.494] ersDataProducer.cpp:387 gpu.counters: OnSetup called, name: gpu.counters, trace_duration_ms: 7000, counter_period_ms: 50, tracing_session_id: 11, session: 0x7325520c70
11:43:31.091 I: [start⇒ValidateDevice] [launch producer] [728.495] perfetto.cc:38224 Starting data source 22
11:43:31.091 I: [start⇒ValidateDevice] [launch producer] [728.495] ersDataProducer.cpp:411 gpu.counters: OnStart called, session: 0x7325520c70
11:43:38.091 I: [start⇒ValidateDevice] [launch producer] [735.496] perfetto.cc:38245 Stopping data source 22
11:43:38.379 I: [start⇒ValidateDevice] [launch producer] [735.784] ersDataProducer.cpp:427 gpu.counters: OnStop called, session: 0x7325520c70
11:43:38.380 I: [start⇒ValidateDevice] [launch producer] [735.784] perfetto.cc:38283 Ending async stop of data source 22
11:43:38.419 I: [ValidateDevice] Perfetto trace size 503314 bytes
11:43:38.438 I: [ValidateDevice] Writing trace size 503314 bytes to C:\Users\Administrator\AppData\Local\Temp\validation3727662812.perfetto
11:43:39.021 I: [ValidateDevice] Device validation failed with error code (3) and reason: Passed 2 checks out of 9, expected to pass 9 check(s). Failed check for 7 counter(s): [{1 Clocks / Second 0x7ff790ac1f60} {3 GPU %!U(MISSING)tilization 0x7ff790ac1f60} {21 %!S(MISSING)haders Busy 0x7ff790ac1f60} {26 Fragment ALU Instructions / Sec (Full) 0x7ff790ac1f60} {31 Textures / Fragment 0x7ff790ac1f60} {37 %!T(MISSING)ime Shading Fragments 0x7ff790ac1f60} {38 %!T(MISSING)ime Shading Vertices 0x7ff790ac1f60}]
11:43:39.021 W: [start⇒ValidateDevice] Killing C:\Program Files\RenderDoc\plugins\android\adb.exe (context cancelled)
11:43:39.021 W: [ValidateDevice] Perfetto client read error: context canceled
11:43:39.021 I: [start⇒ValidateDevice] [launch producer] Exit.
11:43:39.021 E: [start⇒ValidateDevice] [EnsurePerfettoProducerLaunched] error: context canceled

ttanatb commented 8 months ago

This is not an AGI issue. The validation failure comes from a faulty implementation of the device's GPU profiling functionality. Please report it to the device manufacturer if you want it fixed. In the meantime, you can run AGI with the --skip-device-validation flag to bypass the validation step, but you may run into other issues when profiling.