opinsys / puavo

Common placeholder project for all Puavo related projects to handle issues in one place

Shutdown problems with HP Compaq dc7900 Ultra-Slim Desktop #164

Closed asokero closed 9 years ago

asokero commented 10 years ago

It seems that this device model does not shut down. Shutdown begins, but it gets stuck with the Opinsys boot splash spinning forever.

```
adm-asokero@lan-2114-ope:~$ lspci
00:00.0 Host bridge: Intel Corporation 4 Series Chipset DRAM Controller (rev 03)
00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
00:03.0 Communication controller: Intel Corporation 4 Series Chipset HECI Controller (rev 03)
00:03.2 IDE interface: Intel Corporation 4 Series Chipset PT IDER Controller (rev 03)
00:03.3 Serial controller: Intel Corporation 4 Series Chipset Serial KT Controller (rev 03)
00:19.0 Ethernet controller: Intel Corporation 82567LM-3 Gigabit Network Connection (rev 02)
00:1a.0 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #5 (rev 02)
00:1a.2 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #6 (rev 02)
00:1a.7 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801JD/DO (ICH10 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801JD/DO (ICH10 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801JD/DO (ICH10 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB controller: Intel Corporation 82801JD/DO (ICH10 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a2)
00:1f.0 ISA bridge: Intel Corporation 82801JDO (ICH10DO) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801JD/DO (ICH10 Family) 4-port SATA IDE Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801JD/DO (ICH10 Family) 2-port SATA IDE Controller (rev 02)```

adm-asokero@lan-2114-ope:~$ sudo lshw
lan-2114-ope              
    description: Space-saving Computer
    product: HP Compaq dc7900 Ultra-Slim Desktop (KP722AV)
    vendor: Hewlett-Packard
    serial: CZC91875XH
    width: 32 bits
    capabilities: smbios-2.5 dmi-2.5 smp-1.4 smp
    configuration: administrator_password=disabled boot=normal chassis=space-saving cpus=2 family=103C_53307F power-on_password=disabled sku=KP722AV uuid=A559519F-E63E-DE11-BBDA-818FCE810024
  *-core
       description: Motherboard
       product: 3033h
       vendor: Hewlett-Packard
       physical id: 0
       serial: CZC91875XH
     *-firmware
          description: BIOS
          vendor: Hewlett-Packard
          physical id: 1
          version: 786G1 v01.16
          date: 03/05/2009
          size: 128KiB
          capacity: 4032KiB
          capabilities: pci pnp upgrade shadowing cdboot bootselect edd int13floppytoshiba int13floppy720 int5printscreen int9keyboard int14serial int17printer acpi usb ls120boot zipboot biosbootspecification netboot
     *-cpu:0
          description: CPU
          product: Pentium(R) Dual-Core  CPU      E5200  @ 2.50GHz
          vendor: Intel Corp.
          physical id: 5
          bus info: cpu@0
          version: 6.7.6
          serial: 0001-0676-0000-0000-0000-0000
          slot: XU1 PROCESSOR
          size: 2500MHz
          capacity: 2500MHz
          width: 64 bits
          clock: 800MHz

asokero commented 10 years ago

[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: CPU: 0 PID: 0 at /tmp/opinsys-linux/drivers/iommu/dmar.c:488 warn_invalid_dmar+0x8d/0xa0()
[    0.000000] Your BIOS is broken; DMAR reported at address fed90000 returns all ones!
[    0.000000] BIOS vendor: Hewlett-Packard; Ver: 786G1 v01.16; Product Version:  
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.13.0-36-generic #63+opinsys1
[    0.000000] Hardware name: Hewlett-Packard HP Compaq dc7900 Ultra-Slim Desktop/3033h, BIOS 786G1 v01.16 03/05/2009
[    0.000000]  00000000 00000000 c1917e88 c1653137 c1917ec8 c1917eb8 c105696e c189ff0c
[    0.000000]  c1917ee8 00000000 c189ff74 000001e8 c154eadd c154eadd c1b90010 c1b90060
[    0.000000]  ffe15c5f c1917ed0 c1056a02 0000000b c1917ec8 c189ff0c c1917ee8 c1917f14
[    0.000000] Call Trace:
[    0.000000]  [<c1653137>] dump_stack+0x41/0x52
[    0.000000]  [<c105696e>] warn_slowpath_common+0x7e/0xa0
[    0.000000]  [<c154eadd>] ? warn_invalid_dmar+0x8d/0xa0
[    0.000000]  [<c154eadd>] ? warn_invalid_dmar+0x8d/0xa0
[    0.000000]  [<c1056a02>] warn_slowpath_fmt_taint+0x32/0x40
[    0.000000]  [<c154eadd>] warn_invalid_dmar+0x8d/0xa0
[    0.000000]  [<c1a07ac9>] check_zero_address+0x112/0x12e
[    0.000000]  [<c1a07af7>] detect_intel_iommu+0x12/0x78
[    0.000000]  [<c19c165e>] pci_iommu_alloc+0x3b/0x5a
[    0.000000]  [<c19cf39d>] mem_init+0xe/0x205
[    0.000000]  [<c164ec29>] ? printk+0x50/0x52
[    0.000000]  [<c19e1745>] ? page_cgroup_init_flatmem+0x6d/0x98
[    0.000000]  [<c19b790d>] start_kernel+0x1cf/0x3b9
[    0.000000]  [<c19b7575>] ? repair_env_string+0x51/0x51
[    0.000000]  [<c19b739c>] i386_start_kernel+0x137/0x13a
[    0.000000] ---[ end trace 784f413cf074728c ]---

tuomasjjrasanen commented 9 years ago

Unfortunately, I cannot reproduce the problem. I have been booting the device and shutting it down from various states with various mechanisms, and it has shut down properly every time.

tuomasjjrasanen commented 9 years ago

BIOS version: 786G1 v01.22

tuomasjjrasanen commented 9 years ago

Experienced this on one boot:

[    2.235352] [drm] Initialized drm 1.1.0 20060810
[    2.253169] wmi: Mapper loaded
[    2.273663] [drm] Memory usable by graphics device = 2048M
[    2.273796] ------------[ cut here ]------------
[    2.273820] WARNING: CPU: 0 PID: 115 at /tmp/opinsys-linux/drivers/gpu/drm/i915/intel_opregion.c:266 swsci+0x2c9/0x2e0 [i915]()
[    2.273821] excessive driver sleep timeout (DSPL) 167772160
[    2.273822] Modules linked in: i915(+) floppy(+) wmi video i2c_algo_bit drm_kms_helper pps_core drm
[    2.273830] CPU: 0 PID: 115 Comm: systemd-udevd Tainted: G          I   3.13.0-36-generic #63+opinsys1
[    2.273831] Hardware name: Hewlett-Packard HP Compaq dc7900 Ultra-Slim Desktop/3033h, BIOS 786G1 v01.22 08/25/2009
[    2.273833]  00000000 00000000 f75fdaa0 c1653137 f75fdae0 f75fdad0 c105696e f8906dac
[    2.273838]  f75fdafc 00000073 f8906dd8 0000010a f88e8129 f88e8129 f842c940 f74cbc00
[    2.273842]  000001f4 f75fdae8 c10569c3 00000009 f75fdae0 f8906dac f75fdafc f75fdb20
[    2.273846] Call Trace:
[    2.273852]  [<c1653137>] dump_stack+0x41/0x52
[    2.273856]  [<c105696e>] warn_slowpath_common+0x7e/0xa0
[    2.273878]  [<f88e8129>] ? swsci+0x2c9/0x2e0 [i915]
[    2.273900]  [<f88e8129>] ? swsci+0x2c9/0x2e0 [i915]
[    2.273903]  [<c10569c3>] warn_slowpath_fmt+0x33/0x40
[    2.273925]  [<f88e8129>] swsci+0x2c9/0x2e0 [i915]
[    2.273947]  [<f88e8e85>] ? intel_opregion_setup+0xa5/0x3c0 [i915]
[    2.273950]  [<c104a1cf>] ? ioremap_cache+0x1f/0x30
[    2.273972]  [<f88e8fc1>] intel_opregion_setup+0x1e1/0x3c0 [i915]
[    2.273994]  [<f88decd2>] ? intel_setup_gmbus+0x212/0x270 [i915]
[    2.274009]  [<f8881663>] i915_driver_load+0x523/0xdd0 [i915]
[    2.274021]  [<f8509cbb>] drm_dev_register+0x8b/0x1a0 [drm]
[    2.274030]  [<f850b7ae>] drm_get_pci_dev+0x7e/0x120 [drm]
[    2.274034]  [<c11dd817>] ? sysfs_do_create_link_sd.isra.2+0xa7/0x1c0
[    2.274048]  [<f887e58a>] i915_pci_probe+0x3a/0x80 [i915]
[    2.274051]  [<c13321df>] pci_device_probe+0x6f/0xc0
[    2.274053]  [<c11dd955>] ? sysfs_create_link+0x25/0x40
[    2.274057]  [<c140d463>] driver_probe_device+0x93/0x3a0
[    2.274060]  [<c1332132>] ? pci_match_device+0xb2/0xc0
[    2.274062]  [<c140d821>] __driver_attach+0x71/0x80
[    2.274064]  [<c140d7b0>] ? __device_attach+0x40/0x40
[    2.274066]  [<c140b927>] bus_for_each_dev+0x47/0x80
[    2.274069]  [<c140cf2e>] driver_attach+0x1e/0x20
[    2.274071]  [<c140d7b0>] ? __device_attach+0x40/0x40
[    2.274073]  [<c140cb87>] bus_add_driver+0x157/0x230
[    2.274075]  [<c140dde9>] driver_register+0x59/0xe0
[    2.274077]  [<c104a770>] ? alloc_pmd_page+0x50/0x50
[    2.274080]  [<c1330c12>] __pci_register_driver+0x32/0x40
[    2.274089]  [<f850b945>] drm_pci_init+0xf5/0x100 [drm]
[    2.274091]  [<f892c000>] ? 0xf892bfff
[    2.274105]  [<f892c062>] i915_init+0x62/0x64 [i915]
[    2.274108]  [<c1002122>] do_one_initcall+0xd2/0x190
[    2.274110]  [<f892c000>] ? 0xf892bfff
[    2.274113]  [<c104c96f>] ? set_memory_nx+0x5f/0x70
[    2.274115]  [<c164eec1>] ? set_section_ro_nx+0x54/0x59
[    2.274119]  [<c10c43b1>] load_module+0x1121/0x18e0
[    2.274122]  [<c10c4cd5>] SyS_finit_module+0x75/0xc0
[    2.274125]  [<c113a08b>] ? vm_mmap_pgoff+0x7b/0xa0
[    2.274129]  [<c16614cd>] sysenter_do_call+0x12/0x12
[    2.274131] ---[ end trace 439c814c5b25f3bb ]---

But it did not have any effect on the problem at hand. The device halted properly.

tuomasjjrasanen commented 9 years ago

Might be related: http://comments.gmane.org/gmane.linux.centos.general/132647

tuomasjjrasanen commented 9 years ago

A BIOS update could help, but since I cannot reproduce the issue, I have no means to verify its effect. Leaving this issue open for now, but more information on how to reproduce it is needed.
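
For comparing machines, note that the logs above show the reporting workstation on BIOS 786G1 v01.16 and the test machine on v01.22. A minimal sketch for reading the installed version on a running system, assuming the kernel exports DMI data under `/sys/class/dmi/id`; the function name `bios_info` and its optional root argument are hypothetical and only there so the function can be tried against a fake tree:

```shell
# Print the BIOS version and date as exported by the kernel's DMI sysfs files.
# $1 is an optional sysfs root (hypothetical parameter, defaults to /sys).
bios_info() {
    local root dmi
    root="${1:-/sys}"
    dmi="$root/class/dmi/id"
    printf '%s (%s)\n' "$(cat "$dmi/bios_version")" "$(cat "$dmi/bios_date")"
}
```

Running `bios_info` on each affected workstation would show whether the machines that hang share a BIOS revision. Whether a BIOS update changes the behaviour is not established in this thread.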

asokero commented 9 years ago

This problem can now be reproduced with two different devices:

The shutdown problem seems to require that the workstation has a USB Wi-Fi dongle connected. This has been tested with the Asus USB-N53 and the Telewell TW-WLAN 802.11g/n.
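
To record whether a workstation under test actually has a USB wireless adapter attached, sysfs can be inspected. This is a rough sketch that assumes the usual Linux sysfs layout; `list_usb_wireless` and its optional root argument are hypothetical names, not part of Puavo:

```shell
# List wireless network interfaces that sit on the USB bus.
# $1 is an optional sysfs root (hypothetical parameter, defaults to /sys).
list_usb_wireless() {
    local root dev name bus
    root="${1:-/sys}"
    for dev in "$root"/class/net/*; do
        [ -e "$dev" ] || continue
        name=${dev##*/}
        # cfg80211 drivers expose phy80211; wireless-extensions drivers a wireless/ dir.
        [ -e "$dev/phy80211" ] || [ -d "$dev/wireless" ] || continue
        # device/subsystem is a symlink to the bus the device is attached to.
        bus=$(readlink -f "$dev/device/subsystem") || continue
        [ "${bus##*/}" = "usb" ] && echo "$name"
    done
    return 0
}
```

An empty result from `list_usb_wireless` means no USB wireless interface was found, which makes it easy to tag each bootloop run as "with dongle" or "without dongle".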

The problem was made reproducible with the following bootloop script by Tuomas:

#!/bin/bash
set -eu

on_exit()
{
    set +e
    echo "$n successful boots"
    exit $exitvalue
}

hostname=$1
mac=$2

exitvalue=1
n=0

trap on_exit EXIT

while true; do
    # Wake the workstation, then give it time to boot before checking it.
    wakeonlan -i 10.249.15.255 "${mac}"
    sleep 90
    ping -c1 "${hostname}.ltsp.ORGANIZATION.opinsys.fi" || {
        echo "NO PONG, ME SAD" >&2
        exit 1
    }
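    # Rewrite autopoweroff's config with zero delays and restart the service,
    # so that the workstation powers itself off again.
    ssh "${hostname}.ltsp.toimisto.ORGANIZATION.fi" 'sudo sh -c "cat >/etc/autopoweroff.conf"; sudo restart autopoweroff' <<EOF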
[NO_SHUTDOWN_TIME_RANGE]
StartHour=23
EndHour=24

[TIMEOUTS]
StartupDelay=0
IdleTime=0

[DEPENDANTS]
Hosts=
EOF
    n=$((n + 1))
    sleep 30
done

exitvalue=0
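
The script above waits a fixed 90 seconds before its single ping. A possible variation, shown only as a sketch, is to poll until the host answers or a deadline passes. The helper name `wait_for` and its arguments are hypothetical and are not part of the original script:

```shell
# Retry a command until it succeeds or the timeout expires.
# usage: wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for() {
    local timeout interval deadline
    timeout=$1
    interval=$2
    shift 2
    deadline=$(( $(date +%s) + timeout ))
    until "$@"; do
        # Give up once the deadline has passed.
        [ "$(date +%s)" -lt "$deadline" ] || return 1
        sleep "$interval"
    done
}
```

With this helper, the `sleep 90` and `ping` pair could be written as `wait_for 180 5 ping -c1 "${hostname}.ltsp.ORGANIZATION.opinsys.fi"`. Note that shortening the wait changes the timing of the test, so the fixed delay may still be the safer choice when trying to reproduce a timing-sensitive hang.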

asokero commented 9 years ago

Today: 270 boots without a Wi-Fi adapter and no problems. Chances are high that the shutdown problems are caused by the USB dongles.
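
As a rough check on how much 270 clean boots say, one can compute an upper confidence bound on the per-boot failure probability without the adapter. This sketch assumes that boots are independent trials; the function name is hypothetical. With zero failures in n trials, the one-sided bound at confidence 1 - alpha is 1 - alpha^(1/n):

```shell
# Upper confidence bound on a per-boot failure probability after
# n boots with zero failures: 1 - alpha^(1/n).
# usage: zero_failure_upper_bound N [ALPHA]   (ALPHA defaults to 0.05)
zero_failure_upper_bound() {
    local n alpha
    n=$1
    alpha=${2:-0.05}
    LC_ALL=C awk -v n="$n" -v a="$alpha" \
        'BEGIN { printf "%.4f\n", 1 - exp(log(a) / n) }'
}
```

`zero_failure_upper_bound 270` prints `0.0110`, so at 95% confidence the failure rate without the dongle is below about 1.1% per boot. This supports the suspicion only if failures with a dongle are clearly more frequent than that, and the thread does not give a failure count for those runs.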

asokero commented 9 years ago

Changing the kernel didn't seem to have any effect:

juhaerk commented 9 years ago

We think this trick fixes the issue: https://github.com/opinsys/puavo-rules/commit/89d8ae315f8946ff32fd3aaebffa9666b9a22ed3