canonical / checkbox

Checkbox
https://checkbox.readthedocs.io
GNU General Public License v3.0

RTC "Operation not permitted" error sometimes causes suspend/suspend_advanced_auto to fail #1211

Open baconYao opened 4 months ago

baconYao commented 4 months ago

Bug Description

I found that suspend sometimes fails because of "rtcwake: /dev/rtc0: unable to find device: Operation not permitted".

This problem was observed before; see Checkbox issue: https://github.com/canonical/checkbox/issues/857

At that time, the problem could be reproduced easily on the G700 UC22 image, and there was a PR to fix it: https://github.com/canonical/checkbox/pull/979

However, I can still run into this problem intermittently (not 100% of the time) on the G1200-evk.

[Failure Rate] 60% (3/5)

[Test Submissions]

To Reproduce

  1. Install checkbox22 via command $ sudo snap install checkbox22 --beta
  2. Install checkbox via command $ sudo snap install checkbox --channel="uc22/beta" --devmode
  3. Install bluez via command $ sudo snap install bluez
  4. Connect checkbox and bluez via command $ sudo snap connect checkbox:bluez bluez:service
  5. Reboot the DUT and make sure there are no connections to it, such as SSH or a serial console.
  6. Start testing via Checkbox Control on Host $ checkbox.checkbox-cli control
  7. Choose IoT Client Certification for 22.04 classic images (Automated Tests) test plan
  8. Choose bluetooth/detect-output and suspend/suspend_advanced_auto jobs
  9. Execute testing

Environment

Image: Ubuntucore
Checkbox: Snap
Version:

ceqa@ubuntu:~$ snap list
Name              Version       Rev  Tracking     Publisher            Notes
bluez             5.64-5        368  22/stable    canonical✓           -
checkbox-baoshan  0.5dev-jammy  128  latest/edge  ce-certification-qa  devmode
checkbox-ce-oem   1.0-jammy     209  latest/edge  ce-certification-qa  devmode
checkbox22        4.0.0-dev226  886  latest/beta  ce-certification-qa  -

CID: 202307-31864, 202307-31859
Image: genio-core-22-20240418-101.img
kernel-version: 5.15.0-1030-mtk
dtbo: display-dsilvds.dtbo

Relevant log output

No response

Additional context

No response

syncronize-issues-to-jira[bot] commented 4 months ago

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/CHECKBOX-1429.

This message was autogenerated

baconYao commented 2 months ago

The same issue can still be reproduced.

Hook25 commented 2 months ago

Can you try running this command a few times in a console:

$ rtcwake -m no -s 30

just to see if the problem could be an issue with the device. I'm unable to connect to it right now.

After each run also check that

$ rtcwake -m show

gives you the new timer
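
The repeated check suggested here could be scripted roughly as follows. This is only a minimal sketch and not part of Checkbox: the helper name count_failures is hypothetical, and the example invocation assumes rtcwake is run as root on the DUT. The helper runs a command a given number of times and prints how many runs exited with a non-zero status.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print the number of failed runs.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Drop stdout only; stderr is kept so that an error such as
        # "Operation not permitted" stays visible on the console.
        if ! "$@" >/dev/null; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed"
}

# Example (on the DUT, as root): arm a 30-second alarm five times without suspending.
# count_failures 5 rtcwake -m no -s 30
```

If the printed count is 0 when run from a plain console, that would suggest the RTC alarm can be set reliably outside of Checkbox.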

baconYao commented 2 months ago

Hi @Hook25, I don't think it's a problem with the RTC, for the following two reasons:

  1. The same problem can be reproduced with the bluetooth/detect-output job
  2. The device can pass suspend stress

pieqq commented 1 month ago

I'm wondering if this could be an account problem. Checkbox remote is run as root, but then executes commands as the "default" user. You can specify the default user in the launcher, but if it's not specified, Checkbox uses some heuristics to try to figure out the user, or defaults to using ubuntu.

I just learned that these installs are using an account that is not ubuntu, so it might lead to issues. Will investigate this as part of next pulse.

baconYao commented 1 month ago

Hi Peir, I can reproduce this issue even when I add the agent section to the launcher.

Failed Submission

Steps

  1. Boot and log in to the device; the account is ceqa
  2. Bring hci0 up, because it's DOWN by default, via command $ sudo hciconfig hci0 up
  3. Run checkbox via control (remote) like $ checkbox.checkbox-cli
  4. Choose the com.canonical.qa.baoshan::genio-baoshan-core-22-automated test plan and unmark the Baoshan Thermal Set (do not run this set, since it reboots the DUT, which leaves hci0 DOWN)

Launcher of G1200

#!/usr/bin/env checkbox-cli-wrapper
[launcher]
app_id = com.canonical.qa.baoshan:checkbox
launcher_version = 1
stock_reports = text, submission_files, certification

[test plan]
unit = com.canonical.qa.baoshan::genio-baoshan-core-22
filter = com.canonical.qa.baoshan::genio-baoshan-core-22
        com.canonical.qa.baoshan::genio-baoshan-core-22-manual
        com.canonical.qa.baoshan::genio-baoshan-core-22-automated
        com.canonical.qa.baoshan::genio-baoshan-core-22-stress

[agent]
normal_user = ceqa

[manifest]
com.canonical.certification::has_ethernet_adapter = true
com.canonical.certification::has_i2c = true
com.canonical.certification::has_card_reader = true
com.canonical.certification::has_audio_capture = true
com.canonical.certification::has_audio_playback = true
com.canonical.certification::has_hardware_watchdog = true
com.canonical.certification::socket_can_echo_server_running = false
com.canonical.certification::has_bt_adapter = true
com.canonical.certification::has_bt_smart = true
com.canonical.certification::has_thunderbolt3 = false
com.canonical.certification::has_tpm2_chip = false
com.canonical.certification::has_usb_storage = true
com.canonical.certification::has_usbc_video = true
com.canonical.certification::has_usbc_data = true
com.canonical.certification::has_usbc_otg = true
com.canonical.certification::has_wlan_adapter = true
com.canonical.certification::has_wwan_module = false
com.canonical.certification::has_hdmi = true
com.canonical.certification::has_dp = false
com.canonical.certification::has_vga = false
com.canonical.certification::has_dvi = false
com.canonical.certification::has_touchscreen = true
com.canonical.certification::has_touchpad = false
com.canonical.certification::has_eeprom = false
com.canonical.certification::need_snapd_snap_update_test = true
com.canonical.qa.baoshan::has_baoshan_amic = true
com.canonical.qa.baoshan::has_baoshan_dmic = true
com.canonical.qa.baoshan::has_baoshan_pcm = true
com.canonical.qa.baoshan::has_baoshan_g1200_j34_short = true
com.canonical.qa.baoshan::has_hdmi_rx = true
com.canonical.qa.baoshan::has_baoshan_i2s = false
com.canonical.contrib::has_otg = true
com.canonical.contrib::has_ptp = true
com.canonical.contrib::has_socket_can_fd = true
com.canonical.contrib::has_buzzer = false
com.canonical.contrib::has_eeprom = false
com.canonical.contrib::has_gps = false
com.canonical.contrib::has_mtd = false
com.canonical.contrib::has_rs485 = false
com.canonical.contrib::has_rs485_server = false
com.canonical.contrib::has_digital_io = false
com.canonical.contrib::has_button = false
com.canonical.contrib::has_caam = false
com.canonical.contrib::has_sa2ul_engine = false
com.canonical.contrib::has_mcrc_engine = false
com.canonical.contrib::has_optee = false
com.canonical.contrib::has_led_indicator = false
com.canonical.contrib::has_tcp_multi_connection_server = false

[environment]
WPA_BG_SSID = cert-bg-wpa-tel-l4
WPA_BG_PSK = insecure
WPA_N_SSID = cert-n-wpa-tel-l4
WPA_N_PSK = insecure
WPA_AC_SSID = cert-ac-wpa-tel-l4
WPA_AC_PSK = insecure
WPA_AX_SSID = cert-ax-wpa-tel-l4
WPA_AX_PSK = insecure
WPA3_AX_SSID = cert-ax-wpa3-tel-l4
WPA3_AX_PSK = insecure
OPEN_BG_SSID = cert-bg-open-tel-l4
OPEN_N_SSID = cert-n-open-tel-l4
OPEN_AC_SSID = cert-ac-open-tel-l4
OPEN_AX_SSID = cert-ax-open-tel-l4
BTDEVADDR = 4C:80:93:CC:AC:21,7C:B2:7D:4B:14:95,34:6F:24:A8:93:EE,C4:BD:E5:51:D6:95,80:32:53:D8:0D:1E
GENIO_DEVICE=G1200-evk
GENIO_GPU_DRIVER_SNAP=mediatek-genio-g1200-gpu-drivers-core22
GPIO_LOOPBACK_PIN_MAPPING=0:18:0:0:26:100
OTG=USB-C:11200000 Micro-USB:112a1000
SNAP_CONFINEMENT_ALLOWLIST=genio-test-tool,mediatek-genio-g1200-gpu-drivers-core22
HWRNG=1020f000.rng
MODEL_GRADE=signed

baconYao commented 3 weeks ago

Another easy way to reproduce "Operation not permitted" is to run the com.canonical.certification::zapper-enabled-automated test plan. Note that you have to bring the hci0 interface up before executing it, via command $ bluez.hciconfig hci0 up

Submission: https://certification.canonical.com/hardware/202307-31859/submission/386803/test-results/?term=bluetooth%2Fdetect-output
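
The precondition in this comment (hci0 must be up before the run) could be checked with something like the sketch below. The helper name hci_is_up is hypothetical, and it assumes hciconfig prints the flag UP (for example "UP RUNNING") when the interface is up and DOWN otherwise.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the hciconfig output read from stdin
# contains the UP flag as a whole word.
hci_is_up() {
    grep -qw 'UP'
}

# Example (on the DUT): bring hci0 up only if it is not up already.
# bluez.hciconfig hci0 | hci_is_up || bluez.hciconfig hci0 up
```

Reading from stdin keeps the check independent of whether the tool is invoked as hciconfig or bluez.hciconfig.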

pieqq commented 2 weeks ago

I've tried reproducing your issue on a G700, and I cannot.

I followed the steps you provided in your last comment, and I see this:

-----------------------------[ Running job 4 / 6 ]------------------------------
--------------------------[ bluetooth/detect-output ]---------------------------
ID: com.canonical.certification::bluetooth/detect-output
Category: Bluetooth tests
--------------------------------------------------------------------------------
2C:3B:70:3F:D2:F2
--------------------------------------------------------------------------------
Outcome: job passed

(by the way, this device is stuck on snapd 2.63 which has a lot of issues, making running Checkbox a pain, and should be updated to a newer version)

pieqq commented 2 weeks ago

One thing I noticed, though, is that when the checkbox agent service starts, it detects the proper user:

Aug 22 13:23:39 ubuntu systemd[1]: Started Service for snap application checkbox-baoshan.remote-slave.
Aug 22 13:23:54 ubuntu checkbox-baoshan.remote-slave[1292]: WARNING:root:slave is deprecated and will be removed in the next major release of Checkbox. Please use run-agent instead
Aug 22 13:23:54 ubuntu checkbox-baoshan.remote-slave[1292]: WARNING:plainbox.providers.__init__:Using sideloaded provider: checkbox-provider-base, version 4.1.0.dev25 from /var/tmp/ch>
Aug 22 13:23:55 ubuntu checkbox-baoshan.remote-slave[1292]: WARNING:plainbox.providers.__init__:Using sideloaded provider: checkbox-provider-base, version 4.1.0.dev25 from /var/tmp/ch>
Aug 22 13:25:41 ubuntu checkbox-baoshan.remote-slave[1292]: WARNING:plainbox.providers.__init__:Using sideloaded provider: checkbox-provider-base, version 4.1.0.dev25 from /var/tmp/ch>
Aug 22 13:25:41 ubuntu checkbox-baoshan.remote-slave[1292]: WARNING:checkbox_ng.user_utils:Using `ceqa` user
(...)

So if you haven't touched anything in the default launchers, then my theory that it might be related to the user detection heuristic in Checkbox is wrong.
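
To repeat this check on another device, the detected user could be pulled out of the agent log with a small filter like the one sketched here. The helper name detected_user is hypothetical; it only assumes the log line has the form shown above (Using `<name>` user). The systemd unit name in the example is also an assumption, based on snapd's snap.<snap>.<app>.service naming.

```shell
#!/bin/sh
# Hypothetical helper: print the user name from "Using `<name>` user"
# log lines read from stdin; lines that do not match are ignored.
detected_user() {
    sed -n 's/.*Using `\([^`]*\)` user.*/\1/p'
}

# Example (on the DUT); the unit name is an assumption:
# journalctl -u snap.checkbox-baoshan.remote-slave.service | detected_user
```

An empty result would mean no such line was logged, which is not the same as the heuristic picking the wrong user.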

pieqq commented 2 weeks ago

I've now tried running the zapper-enabled-automated test plan on all the devices @baconYao suggested (G510, G700, G1200), and each time, the test passes:

-----------------------------[ Running job 4 / 7 ]------------------------------
--------------------------[ bluetooth/detect-output ]---------------------------
ID: com.canonical.certification::bluetooth/detect-output
Category: Bluetooth tests
--------------------------------------------------------------------------------
10:68:38:D7:E2:AC
--------------------------------------------------------------------------------
Outcome: job passed

I'm using the latest Checkbox22 snap from the beta channel:

checkbox-baoshan                         0.5dev-jammy    140    latest/edge    ce-certification-qa  devmode
checkbox-ce-oem                          1.0-jammy       431    uc22/edge      ce-certification-qa  devmode
checkbox22                               4.2.0-dev9      1057   latest/beta    ce-certification-qa  -

I don't know how to investigate things further.

pieqq commented 1 week ago

I was informed that another device (amd64 this time, not arm64) was having what looks like similar issues: #1435