Open EricAtPlanGrid opened 2 years ago
Hello, thanks for filing this issue!
This part of the validation process can be non-deterministic. Please retry while making sure that, the screen remains unlocked and the sample application remains in focus. If this continues to persists, please upload the trace file (from the 'View trace file' link)
Hi @ttanatb, I can consistently repro this as well (S21+) with the screen unlocked and the sample app visible until the validation fails. Here's my trace:
Looking through the logcat it looks like there's an issue with getting counter values from the driver as everything is set to 0 in the perfetto trace.
Are you by chance using a custom build or a preview build of the OS? Or was the validation working correctly earlier and only recently started breaking?
Thanks @ttanatb. It's just the normal official OS build 🤔 I also tried AGI 2.1.0, which failed, too. That definitely worked a month or two ago on this device, so I suppose the only thing that could have changed is an OS update, the last of which I installed on May 17. Is there a way to get in touch with Samsung to see if something changed with the driver?
It looks like there was a breaking change in the driver recently (not sure if it was included in the May 17 update or the one before that). Unfortunately, reverting the AGI version will not fix this issue, and I don't have an easy workaround for this at the moment. We're working on getting this sorted out and will update once we have something.
Apologies for the delay, here's a workaround to get around issues with getting GPU counters on Adreno GPUs:
adb shell "echo 1 > /sys/class/kgsl/kgsl-3d0/perfcounter"
It's worth noting that you may want to set the contents of /sys/class/kgsl/kgsl-3d0/perfcounter
back to 0
as a precaution after profiling, so that other apps/processes cannot access GPU counters.
I'm working on a change that runs these commands before profiling, will notify once it's in the new release.
I have also the same problem on a Galaxy Tab S8 (SM-X700, Android 12, OneUI 4.1, build SP1A.210812.016.X700XXU2AVF5).
I can't apply the workaround since the device is not rooted (and I don't want to) and /sys/class/kgsl/kgsl-3d0/perfcounter
is only writable by root.
So I guess I'll have to wait for driver update...
I see, I've verified that /sys/class/kgsl/kgsl-3d0/perfcounter
does not require root access on pixel devices, will try to contact folks on Samsung to try to resolve this, but cannot comment on the timeline or feasibility of this. Apologies for the breakage and lack of convenient workaround, but we're working to ensure that we don't end up with a breaking change like this in the future.
Adding in here, ASUS devices (ROG 6 and Zenfone 9) are also affected.
Hi @JakeDaynes have you tried this with the latest dev release https://github.com/google/agi-dev-releases/releases/tag/v3.1.0-dev-20220830 and just to confirm, are you able to run adb shell "echo 1 > /sys/class/kgsl/kgsl-3d0/perfcounter"
or is the access to the file gated by root (instead of shell)?
If the file access is set to root, we'll have to ask ASUS to release a patch to fix this
@ttanatb Sorry for any confusion - I was more commenting on the fact that the ASUS devices are affected by the kgsl-3d0/perfcounter file being gated by root. I've gotten in contact with ASUS over the issue, but having Google reach out and push for a patch would help.
Thanks for the clarification -- just wanted to make sure because I couldn't get a hold of an ASUS device to repro. I'm working with the rest of the team to push for a resolution on this
The team is also working on efforts to prevent this in the future by tightening our test suites to ensure that a patch will not break our product like this in the future. Apologies for the inconvenience
@JakeDaynes Could you provide your trace file (the *.perfetto
file that is linked in the trace dialog) for me to double-check as well? Would like to avoid case in which it happens to be something else -- Thanks!
All good - kgsl drivers/counters have been a nightmare since A7, so wasn't really surprised :)
I'm actually not using AGI (I've got my own IOCTL communication to the driver), but I've confirmed that Snapdragon Profiler is also blocked by this root gate issue as well
Just got an update on my Samsung tablet (SM-X700, Android 12, OneUI 4.1.1, build SP2A.220305.013.X700XXU2AVH4) and /sys/class/kgsl/kgsl-3d0/perfcounter
is now writable by shell group!
Device validation now passes after enabling counters :smiley:
Apologies for the delay, here's a workaround to get around issues with getting GPU counters on Adreno GPUs:
adb shell "echo 1 > /sys/class/kgsl/kgsl-3d0/perfcounter"
It's worth noting that you may want to set the contents of
/sys/class/kgsl/kgsl-3d0/perfcounter
back to0
as a precaution after profiling, so that other apps/processes cannot access GPU counters.I'm working on a change that runs these commands before profiling, will notify once it's in the new release.
Working like a charm. My Note S20 (Qualcomm Adreno (TM) 650, Android 12) did not pass the validation test before using this command.
But another error comes out after this. (Image below)
Can this be related to this problem?
@Cliwo that error looks like your adb connection might've been severed (i.e. faulty cable or locked phone screen), if the issue continues to persist (i.e. not a flakey issue) please file a separate issue.
I'm closing this for now because the update seems to have landed and we can finally look at wiggly little GPU counter numbers again 🥳 Future versions of AGI will run the adb shell "echo 1 > /sys/class/kgsl/kgsl-3d0/perfcounter"
command for you and I'm working to hopefully make sure that future versions of Android will not have this same issue again.
Same problem with S21 FE 5G (SM-G990B2)
adb shell "echo 1 > /sys/class/kgsl/kgsl-3d0/perfcounter"
fails with /system/bin/sh: can't create /sys/class/kgsl/kgsl-3d0/perfcounter: Permission denied
CPU: Qualcomm Snapdragon 888 5G GPU: Adreno 660 Android: 13 Build: TP1A.220624.014.G990B2XXS1DVL1 UI: OneUI 5.0
Host OS: Windows 10 Enterprise 21H2 AGI version: 2022-3.2.1:d808a5ebcdca788a1bd50d105cc21f9888074d4d
Apologize for the delay in getting this resolved, I've communicated the issue with folks at Samsung and am hoping to get this resolved as soon as possible.
In the future, we've added tests in Android 14 to ensure that we don't get this kind of regression again.
Honestly a standardised interface with an available counter map would be a godsend in future versions of the OS. One of Android's basic tenants is transparency, yet over the last 10 major releases I've been doing performance profiling development, we've seen access to certain information killed or modified heavily time and time again.
Really appreciate you working towards keeping this open at least.
Thank you soo much. Is there any way to follow along with Samsung's side of this bug?
I'm not a certain of a way to track this issue publicly, but we've communicated this problem to them via internal channels previously. I can try to ping this issue again to try to understand the timeline on getting this fixed.
I don't want to spam this thread (as I spam this thread) but I do want to add my voice here to say that many of us would benefit from this being resolved! Thank you for your hard work @ttanatb et al!
Pinged the folks at Samsung -- will update this thread once I receive a response
and still, they didn't fix this!
Dare I ask, has there been any further contact with Samsung?
Any updates here @ttanatb ?
Any updates?
🍾 (1 year bump-aversary) … any news?
Environment information:
Bug description
Device validation fails:
Check failed for counter: {1 Clocks / Second 0x46e6a20}
Reproduction steps Steps to reproduce the behavior: Attempt to use AGI w/ these devices
Additional debugging information
adb logcat -d > logcat-full.txt
) that contains logs since AGI was started. gapis.log gapic.log logcat-full.txt