Closed NickyTope closed 3 years ago
I confirm that the latest sb3-v2 commit works fine for me.
@qzed More data about TC 0x21, (we might want to move this to a new issue in the SAM repo):
It seems to have been previously used on Surface Laptop devices:
https://github.com/linux-surface/surface-aggregator-module/issues/4#issuecomment-473236117
You can also see it here in my logs back when I was figuring out how the touchpad works (grep for 80 21), you can see it being fired off when I re-enabled the touchpad under Device Manager: https://gist.github.com/archseer/786b1b712864e909bb6f6ed567473699#file-enable-touchpad-log-L648
I think we possibly also discussed it months ago but I can't find the relevant logs on Matrix.
I got this latest update to load however I can't get any debug output..
ran sudo make dkms-uninstall
in the old directory and confirmed no more input or battery status then added +ccflags-y := -DDEBUG
after the sources += x
lines in the Makefile. I ran sudo ../scripts/ssam-modprobe -r
then again with insmod (output) below, this makes the keyboard and battery info work again but no trackpad and no debug output, am I missing something?
You need to add the ccflags-y := -DDEBUG
in the Kbuild file, not the Makefile.
@archseer Thanks for testing!
It seems to have been previously used on Surface Laptop devices:
This is a different thing. That there is some USB control communication, not via SAM.
You can also see it here in my logs back when I was figuring out how the touchpad works (grep for 80 21), you can see it being fired off when I re-enabled the touchpad under Device Manager: https://gist.github.com/archseer/786b1b712864e909bb6f6ed567473699#file-enable-touchpad-log-L648
Ah yes, that looks exactly like the ones on the SB3. Can't really remember having discussed it, at least not extensively... probably forgot that. I should probably make a list of the features/stuff we should look at and collect the logs for them.
You need to add the ccflags-y := -DDEBUG in the Kbuild file, not the Makefile.
Thanks! that did the trick, no trackpad input registering hopefully this output helps..
ssam-modprobe-insmod-load.txt ssam-modprobe-insmod-trackpad-move.txt ssam-modprobe-insmod-kb-click.txt
theres is a big pause when running the insmod
before it completes.
am running the sb3_test script now also
I forgot to filter the events by IIDs for each device, so each device got all reports... I'm not quite sure if that fixes the trackpad not working though...
If that doesn't work, can you remove every surface_sam_sid_vhf
line here except for the one with .id = 3
(trackpad)? That should enable trackpad support only, so we'll see if there are some interferences.
yes that did enable the trackpad! do you want any of the dmesg output?
theres is a big pause when running the
insmod
before it completes.
That could come from (re-)starting services.
yes that did enable the trackpad! do you want any of the dmesg output?
Which one, the commit I pushed or the fallback change?
sorry my bad, I missed the commit and went straight to the fallback.
just pulled your change and can confirm both keyboard and trackpad are working!
No worries! Awesome! I guess the trackpad HID device might have gotten a bit confused by the messages that weren't meant for it then. With that, I think we can tick off keyboard/trackpad support.
@archseer can you test the latest commit again to check that the filtering doesn't cause any issues on the SL3? If there aren't any issues on the SL3, I think we can merge that into master and update the patches.
Apart from the touchscreen, what are the other things that don't work? You mentioned the DTX driver (also the device it usually probes against is gone from ACPI), anything else?
Just tested the touchscreen now and it is no longer way off, in fact it works quite well..
Looking at the wiki page that leaves:
Pen (connected on bt, no input) SOix (how do I test this?) Sensors - not sure what this refers to? Performance modes
getting this error (maybe im jumping the gun trying to install these?)
➜ module git:(feature/sb3-v2) sudo make dkms-install
mkdir -p /usr/src/"surface_sam"-"0.1"/
cp -t /usr/src/"surface_sam"-"0.1"/ Makefile Kbuild dkms.conf surface_sam_ssh.h surface_sam_ssh.c surface_sam_san.c surface_sam_san.h surface_sam_vhf.c surface_sam_dtx.c surface_sam_hps.c surface_sam_sid.c surface_sam_sid_gpelid.c surface_sam_sid_perfmode.c surface_sam_sid_vhf.c surface_sam_sid_power.c
dkms add "surface_sam"/"0.1"
Error! DKMS tree already contains: surface_sam-0.1
You cannot add the same module/version combo more than once.
make: *** [Makefile:33: dkms-install] Error 3
SOix (how do I test this?)
You need to fetch a few values from sysfs before and after suspending the device. This script can help with that: https://gist.github.com/StollD/139779ed6aa441f661f641d83a108632
Run the script, then suspend (10 seconds should be enough). After resuming, run the script again. Compared to the first run, the values of low_power_idle_cpu_residency_us
, low_power_idle_system_residency_us
and slp_s0_residency_usec
should have increased. If thats the case, the device went into S0ix.
Pen (connected on bt, no input)
Unfortunately, getting the Pen to work on gen7 devices is not possible at the moment, because the IPTS interface changed and we don't know how to switch the device from singletouch (where the stylus is disabled) into multitouch mode. Because IPTS has almost no proper documentation, this will need a fair bit of reverse engineering from someone who owns a gen7 device.
For the rest:
/sys/bus/iio/devices
, I think. There should be an ambient light sensor, you can get the name by cat
ing the name
attribute in the device folder (should be als
for the ambient light sensor). I suspect that those won't be there as it's also currently not supported on the SL3.getting this error (maybe im jumping the gun trying to install these?)
This looks like the previous dkms install didn't fully uninstall everything (thus it thinks that the module is still there and dkms doesn't want to overwrite it). You can run dkms status
to check what's installed and what sources are present. I think with dkms uninstall
(and some parameters) you can uninstall the package.
Unfortunately, getting the Pen to work on gen7 devices is not possible at the moment, because the IPTS interface changed and we don't know how to switch the device from singletouch (where the stylus is disabled) into multitouch mode.
@StollD do we have an open issue for that? If not we should probably open one over at https://github.com/linux-surface/intel-precise-touch.
I'm falling asleep at my desk now, will try the above tomorrow and report back!
You guys are awesome!!!
I've opened a new issue for the DTX driver: https://github.com/linux-surface/surface-aggregator-module/issues/39.
@StollD do we have an open issue for that? If not we should probably open one over at https://github.com/linux-surface/intel-precise-touch.
Good idea. Opened one with all the stuff that we figured out (and that I could remember): https://github.com/linux-surface/intel-precise-touch/issues/4
@NickyTope In addition to https://github.com/linux-surface/linux-surface/issues/201#issuecomment-636870913, can you also test (separately to that, with the latest driver loaded)
./scripts/sb3_test.py kbd-disable 0x01
will disable the keyboard events (or just be silently ignored)./scripts/sb3_test.py kbd-disable 0x02
, which should disable the keyboard events.also
./scripts/sb3_test.py kbd-enable 0x01
should be ignored (as previously in the driver), and./scripts/sb3_test.py kbd-enable 0x02
should then enable the keyboard again.@archseer Would be awesome if you could test all that too.
I want to figure out right now how the event system behaves and an API around it should be implemented. On my SB2, for example, it doesn't matter if I specify priority/channel 0x01 or 0x02 (or any other value for that matter). I.e. if I enable an event with 0x02 and disable it with 0x02, it still gets disabled even though the parameters are different. On the SL3, it looks like that might be similar (as enabling the keyboard events also worked with 0x01).
will make some time to get all this done today. have moved from xubuntu now and am on manjaro with my bspwm config, very happy with the setup so far, all the above worked the same with the arch kernel and aur packages.
my apologies for this taking some time to get to...
here are my findings with the sb3 test script.
setup / teardown don't seem to have any effect.
enable / disable work (they enable and disable the keyboard events)
kbd-disable is ignored with both 0x01 and 0x02 (I had to remove the len(sys.argv)
check to get this to run)
ssam-modprobe-r.txt ssam-modprobe-insmod.txt sb3-test-setup.txt sb3-test-teardown.txt sb3-test-enable.txt sb3-test-disable.txt sb3-test-enable-after-disable.txt sb3-test-enable-keypress.txt sb3-test-kbd-disable-0x01.txt sb3-test-kbd-disable-0x02.txt
here are the names of devices in /sys/bus/iio/devices:
/sys/bus/iio/devices cat iio:device0/name
accel_3d
/sys/bus/iio/devices cat iio:device1/name
gyro_3d
/sys/bus/iio/devices cat iio:device2/name
gravity
/sys/bus/iio/devices cat iio:device3/name
dev_rotation
/sys/bus/iio/devices cat trigger0/name
accel_3d-dev0
/sys/bus/iio/devices cat trigger1/name
gyro_3d-dev1
/sys/bus/iio/devices cat trigger2/name
gravity-dev2
/sys/bus/iio/devices cat trigger3/name
dev_rotation-dev3
setting performance mode does indeed work, here is a debug log setting it to 4:
the values in the S0ix script did not update, so I guess that is unsupported at the moment also, it does suspend and hibernate perfectly well.
~/sfc-debug sudo ./s0ixdebug.sh
Intel(R) Core(TM) i7-1065G7 CPU
5.6.15-arch1-1-surface
NAME="Manjaro Linux"
PRETTY_NAME="Manjaro Linux"
/sys/devices/virtual/dmi/id/product_name:Surface Book 3
/sys/devices/virtual/dmi/id/product_sku:Surface_Book_3_1900
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us:0
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us:0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C2 : 1627362071
/sys/kernel/debug/pmc_core/package_cstate_show:Package C3 : 562913523
/sys/kernel/debug/pmc_core/package_cstate_show:Package C6 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C7 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C8 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C9 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C10 : 0
/sys/kernel/debug/pmc_core/slp_s0_residency_usec:0
~/sfc-debug sudo systemctl suspend
~/sfc-debug sudo ./s0ixdebug.sh
Intel(R) Core(TM) i7-1065G7 CPU
5.6.15-arch1-1-surface
NAME="Manjaro Linux"
PRETTY_NAME="Manjaro Linux"
/sys/devices/virtual/dmi/id/product_name:Surface Book 3
/sys/devices/virtual/dmi/id/product_sku:Surface_Book_3_1900
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us:0
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us:0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C2 : 1674555893
/sys/kernel/debug/pmc_core/package_cstate_show:Package C3 : 624935775
/sys/kernel/debug/pmc_core/package_cstate_show:Package C6 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C7 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C8 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C9 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C10 : 0
/sys/kernel/debug/pmc_core/slp_s0_residency_usec:0
~/sfc-debug sudo systemctl hibernate
~/sfc-debug sudo ./s0ixdebug.sh
Intel(R) Core(TM) i7-1065G7 CPU
5.6.15-arch1-1-surface
NAME="Manjaro Linux"
PRETTY_NAME="Manjaro Linux"
/sys/devices/virtual/dmi/id/product_name:Surface Book 3
/sys/devices/virtual/dmi/id/product_sku:Surface_Book_3_1900
/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us:0
/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us:0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C2 : 4400582
/sys/kernel/debug/pmc_core/package_cstate_show:Package C3 : 1248120
/sys/kernel/debug/pmc_core/package_cstate_show:Package C6 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C7 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C8 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C9 : 0
/sys/kernel/debug/pmc_core/package_cstate_show:Package C10 : 0
/sys/kernel/debug/pmc_core/slp_s0_residency_usec:0
I need to also add that I cannot reboot, I did notice this in xubuntu but thought it was an unrelated crash.
reboot
ing from arch will shutdown but the system freezes at the windows logo on startup and I have to hold the power button to reset it.
shutdown -h now
works fine and the system comes back as normal.
I have refind installed now and can confirm that the freeze I saw previously in grub is not present in refind
oh, just noticed this in dmesg after resume from hibernation:
[ 2909.076559] PM: hibernation: hibernation exit
[ 2909.080710] audit: type=1130 audit(1591522597.507:469): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 2909.224495] audit: type=1130 audit(1591522597.651:470): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-hibernate comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 2909.224498] audit: type=1131 audit(1591522597.651:471): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-hibernate comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 2909.276248] audit: type=1130 audit(1591522597.704:472): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 2909.277040] iwlwifi 0000:00:14.3: Applying debug destination EXTERNAL_DRAM
[ 2909.423207] iwlwifi 0000:00:14.3: FW already configured (0) - re-configuring
[ 2909.449992] iwlwifi 0000:00:14.3: Applying debug destination EXTERNAL_DRAM
[ 2909.597587] iwlwifi 0000:00:14.3: FW already configured (0) - re-configuring
[ 2910.582440] Bluetooth: hci0: Waiting for firmware download to complete
[ 2910.582538] Bluetooth: hci0: Firmware loaded in 1499979 usecs
[ 2910.582546] Bluetooth: hci0: Waiting for device to boot
[ 2910.596850] Bluetooth: hci0: Device booted in 13976 usecs
[ 2910.596856] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-19-32-4.ddc
[ 2910.598781] Bluetooth: hci0: Applying Intel DDC parameters completed
[ 2910.600637] Bluetooth: hci0: Firmware revision 0.0 build 212 week 13 2020
[ 2914.085407] audit: type=1131 audit(1591522602.516:473): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 2914.583397] audit: type=1101 audit(1591522603.016:474): pid=11940 uid=1000 auid=1000 ses=2 subj==unconfined msg='op=PAM:accounting grantors=pam_unix,pam_permit,pam_time acct="nt" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.583527] audit: type=1110 audit(1591522603.016:475): pid=11940 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:setcred grantors=pam_unix,pam_permit,pam_env acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.584892] audit: type=1105 audit(1591522603.016:476): pid=11940 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:session_open grantors=pam_limits,pam_unix,pam_permit acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.623491] audit: type=1101 audit(1591522603.056:477): pid=11946 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:accounting grantors=pam_unix,pam_permit,pam_time acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.623727] audit: type=1110 audit(1591522603.056:478): pid=11946 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:setcred grantors=pam_unix,pam_permit,pam_env acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.627179] audit: type=1105 audit(1591522603.060:479): pid=11946 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:session_open grantors=pam_limits,pam_unix,pam_permit acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.631601] audit: type=1106 audit(1591522603.063:480): pid=11946 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:session_close grantors=pam_limits,pam_unix,pam_permit acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.631708] audit: type=1104 audit(1591522603.063:481): pid=11946 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:setcred grantors=pam_unix,pam_permit,pam_env acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.639273] audit: type=1101 audit(1591522603.070:482): pid=11948 uid=0 auid=1000 ses=2 subj==unconfined msg='op=PAM:accounting grantors=pam_unix,pam_permit,pam_time acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/0 res=success'
[ 2914.824854] iwlwifi 0000:00:14.3: Applying debug destination EXTERNAL_DRAM
[ 2914.971900] iwlwifi 0000:00:14.3: FW already configured (0) - re-configuring
[ 2918.165151] wlp0s20f3: authenticate with 60:38:e0:80:66:2c
[ 2918.168376] wlp0s20f3: send auth to 60:38:e0:80:66:2c (try 1/3)
[ 2918.191975] wlp0s20f3: 60:38:e0:80:66:2c denied authentication (status 1)
[ 2918.379189] wlp0s20f3: authenticate with 60:38:e0:80:66:2b
[ 2918.381865] wlp0s20f3: send auth to 60:38:e0:80:66:2b (try 1/3)
[ 2918.404439] wlp0s20f3: authenticated
[ 2918.407867] wlp0s20f3: associate with 60:38:e0:80:66:2b (try 1/3)
[ 2918.408996] wlp0s20f3: RX AssocResp from 60:38:e0:80:66:2b (capab=0x11 status=0 aid=6)
[ 2918.412326] wlp0s20f3: associated
[ 2918.423569] IPv6: ADDRCONF(NETDEV_CHANGE): wlp0s20f3: link becomes ready
[ 2919.565672] kauditd_printk_skb: 26 callbacks suppressed
[ 2919.565675] audit: type=1131 audit(1591522607.999:509): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 3014.892657] Uhhuh. NMI received for unknown reason 3c on CPU 0.
[ 3014.892657] Do you have a strange power saving mode enabled?
[ 3014.892657] Dazed and confused, but trying to continue
:wave: just wanted to jump in here and say thank you for working on this. I just set up everything from the sb3-v2 branch and things are working really well so far! Let me know if there's anything I can do to help out as well. SB3 i7 here (15").
reboot ing from arch will shutdown but the system freezes at the windows logo on startup and I have to hold the power button to reset it.
Adding reboot=pci
to my kernel cmdline fixed that.
my apologies for this taking some time to get to...
No worries!
here are my findings with the sb3 test script.
setup / teardown don't seem to have any effect.
I think this might only enable events for the REG
subsystem (TC=0x21). There are some events from it directly after issuing the command, which looks the same as on the IRPMon log and as mentioned then could be the events currently enabled. Not sure if there are other events being sent. We could maybe try if we can get it to send us the list on command (maybe CID=3 since that's the CID for the event).
enable / disable work (they enable and disable the keyboard events)
kbd-disable is ignored with both 0x01 and 0x02 (I had to remove the
len(sys.argv)
check to get this to run)
Interesting. So we'll need to add support for the REG
system and enable/disable the events via that. The kbd-
commands not doing anything indicate that the old way to enable/disable events isn't being used here.
I've also had a look at the Windows driver, and it seems like the old and new event enable/disable commands have, as far as I can tell (unfortunately I'm not a 100% sure here as the CID/TC values seem to be supplied by the caller and not the registry function), mostly the same code path. So that would indicate that the old and new ways don't play together, but the new is a replacement for the old.
Another thing that I've noticed is the battery. I accidentally added them via IID (instance-ID) 0 and 1, but on the SB2 those were 1 and 2 (this should pretty much have the effect that there are two battery instances created, but they will display the exact same data, i.e. the data of battery 1, which should be the one integrated into the clipboard). I've fixed that now, however, after that I had another look at the battery messages in the logs, and it looks like this is not going to work: It seems like they separated the batteries not by IID but by priority (which I think we should rename at this point, because that use kind of indicates that it's not a priority but maybe rather something like a transfer channel). That is kind of confirmed by one of the REG
events: 01 00 02 02 01
(the 02 01
at the end is likely TC=2, i.e. battery/power subsystem and IID=1). So likely, the current branch state will break the battery driver and the old state will show only battery 1, but two times... I'll try to fix that tomorrow.
Just to make sure that they haven't done some controller magic to virtually "merge" the two batteries into one for display purposes: The (extended) battery status on Windows does show two batteries?
So with regards to this, I want to try and find the answers for a couple of questions:
REG
system set-up per channel? In the detach/attach IRPMon log, the reg system is only accessed via channel 2, event messages for input devices are sent via channel 2, but event messages for the battery are sent via channel 1 and 2. So to properly enable events, do we have to set up the battery events for channel 1 by accessing the REG
system via channel 1?What are the event messages returned when enabling events for the REG
system? Could be currently enabled events or could be events that can be enabled. Part of the payload could indicate if an event has been enabled or not. Or maybe even how many times if it does internal counting. The last two bytes seem to be TC and IID. The byte before those could be the channel. Some example payloads of the event messages:
01 00 02 03 01
01 00 02 02 01
01 00 02 15 01
01 00 02 15 03
01 00 02 15 05
01 00 02 15 06
01 00 02 21 00
REG
)?I'll open up a new issue for this, otherwise the individual problems might get a bit mixed up. I'll also try to update the testing script tomorrow so that you can do a bit more testing to help figure this out, if you're up for it.
here are the names of devices in /sys/bus/iio/devices:
/sys/bus/iio/devices cat iio:device0/name accel_3d /sys/bus/iio/devices cat iio:device1/name gyro_3d /sys/bus/iio/devices cat iio:device2/name gravity /sys/bus/iio/devices cat iio:device3/name dev_rotation /sys/bus/iio/devices cat trigger0/name accel_3d-dev0 /sys/bus/iio/devices cat trigger1/name gyro_3d-dev1 /sys/bus/iio/devices cat trigger2/name gravity-dev2 /sys/bus/iio/devices cat trigger3/name dev_rotation-dev3
Looks like most of them are there. I believe the ambient light sensor is missing. That might be a custom one, probably the same or a similar one as on the SL3 (IIRC @archseer tried to figure that out on the SL3).
Edit: Opened an issue of the ALS: https://github.com/linux-surface/kernel/issues/53.
No clue about the S0ix, I honestly don't know that much about that topic. @kitakar5525 has a bit more insight into that, I believe.
The reboot thing is also an issue on the SL3 and SP7: https://github.com/linux-surface/linux-surface/issues/113. I've added a wiki entry for this: https://github.com/linux-surface/linux-surface/wiki/Surface-Book-3. No idea what it would take/how to fix that in-kernel.
Adding reboot=pci to my kernel cmdline fixed that.
welcome, and thanks this encouraged me to clean up my refind conf and get intel-ucode working! can confirm this also works for me.
the values in the S0ix script did not update, so I guess that is unsupported at the moment also, it does suspend and hibernate perfectly well.
[...] ~/sfc-debug sudo systemctl suspend ~/sfc-debug sudo ./s0ixdebug.sh Intel(R) Core(TM) i7-1065G7 CPU 5.6.15-arch1-1-surface NAME="Manjaro Linux" PRETTY_NAME="Manjaro Linux" /sys/devices/virtual/dmi/id/product_name:Surface Book 3 /sys/devices/virtual/dmi/id/product_sku:Surface_Book_3_1900 /sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us:0 /sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us:0 /sys/kernel/debug/pmc_core/package_cstate_show:Package C2 : 1674555893 /sys/kernel/debug/pmc_core/package_cstate_show:Package C3 : 624935775 /sys/kernel/debug/pmc_core/package_cstate_show:Package C6 : 0 /sys/kernel/debug/pmc_core/package_cstate_show:Package C7 : 0 /sys/kernel/debug/pmc_core/package_cstate_show:Package C8 : 0 /sys/kernel/debug/pmc_core/package_cstate_show:Package C9 : 0 /sys/kernel/debug/pmc_core/package_cstate_show:Package C10 : 0 /sys/kernel/debug/pmc_core/slp_s0_residency_usec:0 [...]
Regarding S0ix, the SB3 didn't enter deeper state than Package C3 (pc3)
. Reaching pc10 is a prerequisite for achieving S0ix, but it didn't enter even pc10.
These blog posts from Intel are useful to diagnose S0ix:
Also I'm curious about the dmesg log that contains at least one suspend/resume cycle with additional debug output:
# additional debug output
echo 1 | sudo tee /sys/power/pm_debug_messages
# go into s2idle, 10 sec is usually enough
systemctl suspend
then get the dmesg log. I like logs with log levels, so the output of dmesg -x
is helpful.
I also think it's useful to test S0ix after tuning with powertop / TLP. The default kernel configuration is more aimed towards desktops and servers rather than laptops.
one thing I noticed is that my SB3 seems to run a lot hotter in linux than in windows, even at idle and the battery saver power mode (using surface performance set
) it stays around ~50* C and is noticeably warm to the touch.
I'll experiment more and see if I can figure anything out there. @NickyTope have you noticed the same?
I've updated the battery driver. Can you guys check if it works correctly? Specifically:
cat /sys/class/power_supply/BAT0/energy_full
and
cat /sys/class/power_supply/BAT1/energy_full
Those values should be different.I have also updated the module to use the same detachment system (DTX) driver as on the SB2. Based on the logs above, the interface shouldn't have changed. Can you check if that works in combination with https://github.com/linux-surface/surface-dtx-daemon/? Specifically, detaching should be faster (at least if it's the same on the SB2 where without this it takes a couple of seconds) and notifications should be displayed when you can detach the clipboard and after it has been re-attached.
battery looks good!
~$ cat /sys/class/power_supply/BAT0/energy_full
22150000
~$ cat /sys/class/power_supply/BAT1/energy_full
61340000
Charging status is updated as well when I plug/unplug.
I'll let the battery drain a little and then compare it to what windows reports as well.
Detach is REALLY fast now (I think faster than windows) but when I reattach, the trackpad/keyboard are no longer recognized. ~I didn't see any notifications, let me check that the daemon was running at the time as well.~ (edit: I was missing the user dtx service, enabling/starting that gives me the notification now, but the keyboard and trackpad still don't work).
~EDIT: I think the change broke reboot/shutdown as well. After X exits, I see surface_sam_sid_battery.1: unhandled power event (cid = 0x53) repeated every 10 seconds.~
edit 2: It looks like the shutdown/reboot hang only happens after detaching and reattaching. I just tried again w/o detaching and rebooting worked fine.
Okay, could be that we have to enable the keyboard events again after re-attaching. Can you compile and run the module with debugging enabled (un-comment the ccflags-y + -DDEBUG' line in the
Kbuild` file) and post a log of you trying to type after re-attachment? That should confirm if the events are missing or if there's something else going wrong.
You can also try to enable the events by running the sb3_test.py
script (I've just updated that) with the following parameters:
./scripts/sb3_test.py enable 0x02 0x15 0x00 0x01
./scripts/sb3_test.py enable 0x02 0x15 0x00 0x03
if that doesn't work, try running
./scripts/sb3_test.py enable 0x02 0x21 0x01 0x00
before the above commands. if the events have been disabled, those should enable them again.
Regarding shutdown: Does it shut down properly without detaching? Do you have a dGPU model? If so, and you have the drivers for that loaded, detaching could have caused the driver or something relying on it to crash, preventing a clean shutdown. In that case, can you try if you can shutdown when you have blacklisted every dGPU driver, i.e.
blacklist nouveau
blacklist lbm-nouveau
alias nouveau off
alias nouveau-lbm off
blacklist nvidia
alias nvidia off
It should also be enough to just unload them before detaching, if there aren't any programs blocking that (Gnome has the tendency to do so).
In any case (regardless of you having a dGPU or not), can you check if the dmesg log indicates any driver crashes or oddities?
The unhandled power event (cid = 0x53)
should not cause/indicate a shutdown failure. It's simply a message from the EC stating that some battery/power-consumption values have changed (unfortunately we don't know yet what specific change this indicates, thus us not handling it and thus the warning). See https://github.com/linux-surface/linux-surface/issues/136 and https://github.com/linux-surface/surface-aggregator-module/issues/29 for details.
You can also try to enable the events by running the sb3_test.py script (I've just updated that) with the following parameters:
This 1/2 worked. The keyboard came back 100%, and trackpad 50%. I lost multitouch gestures on the trackpad though (two finger right click, two finger scrolling, etc) I'll rebuild the module with debug logging on again as well in a second.
Regarding shutdown: Does it shut down properly without detaching?
Yep
Do you have a dGPU model?
Also yep.
If so, and you have the drivers for that loaded, detaching could have caused the driver or something relying on it to crash, preventing a clean shutdown
I think you're right here. I looked back at the logs from my past boot and saw a bunch of nouveau crashes. I'll clean up the log and attach it here.
Also, thanks again for all the work on this, its amazing so far!
dmesg output attached: dmesg-2.txt
This 1/2 worked. The keyboard came back 100%,
Neat!
and trackpad 50%. I lost multitouch gestures on the trackpad though (two finger right click, two finger scrolling, etc)
That actually makes perfect sense: To enable multitouch on the trackpad, a special feature report has to be sent. This is handled by the HID driver when the device is created/added the first time. Detaching the base will basically disconnect the device without it being removed in the driver, so attaching it won't really add it in the driver, so the HID subsystem won't send the special feature report to enable multitouch.
This means that we'll have to fully remove the input devices on detach and re-add them on attach.
I think you're right here. I looked back at the logs from my past boot and saw a bunch of nouveau crashes. I'll clean up the log and attach it here.
The same issues are also present on the SB2, so was to be expected. The log shows that the crashes occur on disconnect, so that's very likely what's causing them. And most crashes like these block shutdown. This is (next to power-savings) one reason I've disabled the dGPU for the SB2 by default. Normally, you should be able to unload the module before detaching, but Gnome for some reason keeps a finger on the module and doesn't let you, if you've started Gnome/GDM with the dGPU turned on (if you turn it on later, it works). Never found out how to avoid that and why Gnome does that. The SB3 seems a bit different on how dGPU on/off is handled, but I only had a quick glance so far. That's another thing for the future, for now I'd like to focus on the keyboard driver.
That actually makes perfect sense: To enable multitouch on the trackpad, a special feature report has to be sent. This is handled by the HID driver when the device is created/added the first time. Detaching the base will basically disconnect the device without it being removed in the driver, so attaching it won't really add it in the driver, so the HID subsystem won't send the special feature report to enable multitouch.
This means that we'll have to fully remove the input devices on detach and re-add them on attach.
Oh that does make sense!
The SB3 seems a bit different on how dGPU on/off is handled, but I only had a quick glance so far. That's another thing for the future, for now I'd like to focus on the keyboard driver.
I was able to remove the nouveau
driver after a couple tries actually, but I also am fine keeping it off for now, I don't have much of a use for the dGPU in linux right now anyways, and like you said, it's good for power savings.
one thing I noticed is that even with drivers disabled, the dGPU is always on, which makes it get pretty hot. I spent some time last night poking around and found a few things. This is mostly just a brain dump:
The san device is no longer there on the SB3, so its driver never loads, which means the hps driver fails here. I commented this out to just get further, it seems it's controlled via the ssh device now.
The ACPI entry(?) changed from RP5 to RP13. The GUID is the same it seems though.
Now here's the confusing part. Once we get into the ACPI data, we end up here. However, I can't actually find the PCI0.RP13.MRGT
method anywhere, which also is confirmed by the fact that it fails from the driver as well. In the SB2 ACPI dump it's defined here.
It looks like the dGPU can be turned on/off via ACPI (HGOF, HGON), I see it in the SB2 dumps as well, maybe this can be used instead?
Only had a quick look at it so far, so take this with a grain of salt.
- It looks like the dGPU can be turned on/off via ACPI (HGOF, HGON), I see it in the SB2 dumps as well, maybe this can be used instead?
This looks (mostly) exactly like the DSM function part on the SB2. This is the code called by __shps_dgpu_dsm_set_power_unlocked
. By itself, this method is enough to cut power to the dGPU, at least on the SB2, so I'd assume it'd do the same on the SB3. You could try comment out stuff in the driver so that it'll only call that function. NB: This is the only function that I initially used, the root-port power setting stuff all came later as an improvement.
The reason I also do some root port power settings are to properly update the PCIe core drivers (and via that userspace). The PCIe settings should actually work (that's not really that Surface specific) as long as the dGPU is the only thing connected to the root port (which it looks like). Only problem is to get the root port or dGPU PCIe addres. On that note
- The ACPI entry(?) changed from RP5 to RP13. The GUID is the same it seems though.
GUID being the same should mean that the semantics of the DSM haven't changed. So the interface should be the same (apart from bugs...).
You could try to make an acpidump of your own and grep for it, in case the one here is somehow incomplete. But this could also be a bug in the DSDT. Maybe MS isn't using that part and hard-code it somewhere. If I have some time I'll try to dig into the Windows driver.
- The san device is no longer there on the SB3, so its driver never loads, which means the hps driver fails here. I commented this out to just get further, it seems it's controlled via the ssh device now.
Yeah, The SAN device used to send notifications when the dGPU would power up (after resume). This is actually quite useful to synchronize the PCIe core and/or get it to turn off again if it's an unwanted power-up. No clue how that behaves on the SB3.
thanks for the comments! I'll try out your suggestions and report back later today.
By itself, this method is enough to cut power to the dGPU, at least on the SB2, so I'd assume it'd do the same on the SB3.
This seems like a great place to start, then I can try adding back functionality incrementally as I figure out the missing parts.
You could try to make an acpidump of your own and grep for it
Yeah I checked in mine too and its missing. I was also going to check the windows driver and see what it's doing here. Very strange though, your theory of them hard coding it into the driver seems most likely.
Turning off the dGPU worked with the suggestion above. I'm going to try to figure out why the GPIOs aren't working now.
[Wed Jun 10 13:37:48 2020] genirq: Setting trigger mode 3 for irq 165 failed (intel_gpio_irq_type+0x0/0x130 [pinctrl_intel])
[Wed Jun 10 13:37:48 2020] surface_dgpu_hps MSHW0153:00: base irq failed: -1
I've opened some new issues in an attempt to clean up this discussion a bit and make it easier to follow specific parts. Here's a list of all relevant issues:
Would be great if we could move the specific discussions to those.
awesome, that's a great idea, thank you!
I've described some stuff that I'd love for you guys to test over on https://github.com/linux-surface/surface-aggregator-module/issues/40#issuecomment-642976709 (also has a bit of explanation regarding SAM, which might or might not be interesting to you if you're confused about my ramblings here and in those issues).
I also think it'd be useful to identify the specific keyboard/touchpad devices that are used: https://github.com/linux-surface/surface-aggregator-module/issues/42#issuecomment-642880467.
Thanks again @qzed I'll make time to get to all of this over the weekend
Tried installing both Arch and ubuntu in my new SB3 13.5 i7 32gb
In both cases uname reports the surface kernel however both the keyboard and trackpad do not function.
Is there anything I can do to help support this device?
Currently am back on windows but happy to install any distro again to test..