system76 / firmware-open

System76 Open Firmware

adl: Devices in TBT port not detected on boot #343

Open crawfxrd opened 1 year ago

crawfxrd commented 1 year ago

On the ADL models, any device (storage, display) in the TBT port is not detected at boot.

Steps to reproduce

Expected behavior

Actual behavior

Additional info

redsie commented 1 year ago

Lemur Pro 11 / i7-1255U - the same issue, opened a ticket (ID 85021) with System76. I think it may be related to the default thunderbolt security level set to 'user'.

adeptg commented 1 year ago

Same here with darp8 and the external USB-C display: when the display is connected, and I reboot the laptop, the laptop doesn't detect the display till I unplug it and plug it again. I reported the issue to support a few days ago (ticket ID is 100629), don't have an answer yet.

asiftali commented 1 year ago

Same. I also opened a ticket (103898) which led me here.

I can also confirm that on my lemp11 the output of boltctl domains shows the security level as iommu+user. I think the ability to change this to something like secure, and in addition to add bootacls, is what I've seen used in other systems.
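For anyone who wants to check the same values without boltctl, here is a minimal sketch that reads them from sysfs. It assumes the kernel exposes the `security` and `iommu_dma_protection` attributes under `/sys/bus/thunderbolt/devices/domain0`; the function names are made up for illustration, and the "iommu+" prefix is only a guess at how boltctl composes the label seen above.

```python
from pathlib import Path


def read_domain_security(domain_dir="/sys/bus/thunderbolt/devices/domain0"):
    """Return (security_level, iommu_enabled) for one Thunderbolt domain.

    Assumption: the domain directory exposes `security` and, on newer
    kernels, `iommu_dma_protection` (containing "0" or "1").
    """
    domain = Path(domain_dir)
    level = (domain / "security").read_text().strip()
    iommu_file = domain / "iommu_dma_protection"
    iommu = iommu_file.exists() and iommu_file.read_text().strip() == "1"
    return level, iommu


def describe_security(level, iommu):
    """Compose a label in the style reported above (hypothetical formatting)."""
    return f"iommu+{level}" if iommu else level
```

Calling `describe_security(*read_domain_security())` on an affected machine should give a label comparable to the boltctl output. Note that this only reads the level; it does not change it.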

Kirizan commented 1 year ago

I am facing the same issue. It's quite annoying to have to log into the OS just to use my keyboard, mouse, and monitor. It really makes it impossible to use the laptop closed. I'm still very new to using Linux as a desktop OS, so I'm glad it's not just me with this issue.

antonkulaga commented 1 year ago

My eGPU does not work at all. Even unplugging and replugging does not help. I wrote to System76 support, but they were pretty useless and did not suggest anything (other than installing a newer driver, which I had already done myself, along with trying many other options like egpu-switcher, which never worked).

leviport commented 1 year ago

but they were pretty useless

reminder: https://github.com/pop-os/code-of-conduct

jtegtmeier commented 1 year ago

Seems to be a similar issue with oryp10. Unplugging and plugging the Thunderbolt/USB-C dock back in resolves the issue temporarily (edit: until next boot), but is quite tedious.

shubb30 commented 1 year ago

I am having a similar (yet slightly different) issue. I have a Darter Pro and am using a CalDigit TB dock. When I plug in my two monitors, they are both detected correctly. As soon as I reboot, they are not detected. If I boot with the TB dock disconnected and then plug it in, only one monitor is detected, and the second is not. Thinking the cable was bad, I swapped the cables between the two monitors, and they were both detected again. Rebooting brought me back to the same problem, where only one monitor works when the dock is plugged in after boot.

ravdiculous commented 10 months ago

For visibility, I am seeing this issue on the oryp11. Thank you to anyone who has time to work on this!

dustinlennon commented 10 months ago

I want to thank Francis in customer support for bringing my attention to this bug report / thread. I'm also running into the same issue on lemp11.

One of the things I looked into was how udev reports before / after a replugging. That is, running the following command:

sudo udevadm info --attribute-walk --path=/sys/devices/pci0000:00/0000:00:0d.2/domain0/0-0

In what follows, recall that for the lemp11, 0000:00:0d.2 has the relevant lspci entry:

00:0d.2 USB controller: Intel Corporation Alder Lake-P Thunderbolt 4 NHI #0 (rev 04)

Before the replugging, I obtained

  looking at device '/devices/pci0000:00/0000:00:0d.2/domain0/0-0':
    KERNEL=="0-0"
    SUBSYSTEM=="thunderbolt"
    ATTR{authorized}=="1"
    ATTR{device}=="0x463e"
    ATTR{device_name}=="Gen12"
    ATTR{generation}=="4"
    ATTR{power/async}=="disabled"
    ATTR{power/control}=="auto"
    ATTR{power/runtime_enabled}=="enabled"
    ATTR{power/runtime_status}=="suspended"
    ATTR{unique_id}=="a2a78780-5064-250f-ffff-ffffffffffff"
    ATTR{vendor}=="0x8087"
    ATTR{vendor_name}=="INTEL"
    ATTR{waiting_for_supplier}=="0"

and after,

  looking at device '/devices/pci0000:00/0000:00:0d.2/domain0/0-0':
    KERNEL=="0-0"
    SUBSYSTEM=="thunderbolt"
    ATTR{authorized}=="1"
    ATTR{device}=="0x463e"
    ATTR{device_name}=="Gen12"
    ATTR{generation}=="4"
    ATTR{power/async}=="disabled"
    ATTR{power/control}=="auto"
    ATTR{power/runtime_enabled}=="enabled"
    ATTR{power/runtime_status}=="active"
    ATTR{unique_id}=="a2a78780-5064-250f-ffff-ffffffffffff"
    ATTR{vendor}=="0x8087"
    ATTR{vendor_name}=="INTEL"

The key difference is that power/runtime_status goes from "suspended" to "active". (The waiting_for_supplier attribute is also missing from the second listing.)
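Comparing two attribute walks by eye is error-prone, so here is a small sketch that diffs them. It assumes the output format shown above (`ATTR{name}=="value"` lines under a "looking at" header) and only inspects the first device block; the function names are illustrative, and the sample strings are abbreviated excerpts of the two listings.

```python
import re

# Matches lines such as:  ATTR{power/runtime_status}=="active"
ATTR_LINE = re.compile(r'^\s*ATTR\{(?P<name>[^}]+)\}=="(?P<value>.*)"\s*$')


def parse_attrs(walk_output: str) -> dict:
    """Collect ATTR{...} pairs from the first device block of an attribute walk."""
    attrs = {}
    seen_device = False
    for line in walk_output.splitlines():
        if line.strip().startswith("looking at"):
            if seen_device:
                break  # a second header means we reached a parent device
            seen_device = True
            continue
        match = ATTR_LINE.match(line)
        if match:
            attrs[match["name"]] = match["value"]
    return attrs


def diff_attrs(before: dict, after: dict) -> dict:
    """Map each differing attribute to (before, after); None means absent."""
    changed = {}
    for name in sorted(before.keys() | after.keys()):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Abbreviated excerpts of the two listings above.
BEFORE = """
  looking at device '/devices/pci0000:00/0000:00:0d.2/domain0/0-0':
    ATTR{power/runtime_status}=="suspended"
    ATTR{vendor}=="0x8087"
    ATTR{waiting_for_supplier}=="0"
"""
AFTER = """
  looking at device '/devices/pci0000:00/0000:00:0d.2/domain0/0-0':
    ATTR{power/runtime_status}=="active"
    ATTR{vendor}=="0x8087"
"""

print(diff_attrs(parse_attrs(BEFORE), parse_attrs(AFTER)))
```

On the excerpts, the diff reports `power/runtime_status` changing from "suspended" to "active" and `waiting_for_supplier` disappearing. In practice the two inputs would be saved outputs of the udevadm command, captured before and after replugging.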

Looking at some of the change logs in /boot/efi/system76-firmware-update/firmware, and after a cursory review of the git logs, it seems that one of the themes of recent updates has been suspension/power settings.

I'm wondering if this connection might be helpful in resolving the replugging issue. Is it possible to change the firmware code so that the NHI is brought up in "active" mode? Alternatively, is there something that can be done post-boot, like triggering an event, that might force the NHI into "active" mode?
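As a rough sketch of the post-boot idea: the kernel's generic runtime PM interface lets root write "on" to a device's `power/control` file to keep it from runtime-suspending ("auto" restores the default). The helper names below are hypothetical, the PCI address is the lemp11 one quoted above, and whether keeping the NHI active has any effect on this bug is untested and purely an assumption.

```python
from pathlib import Path

# PCI address of the Thunderbolt NHI on the lemp11, per the lspci line above.
NHI_DEVICE = "/sys/bus/pci/devices/0000:00:0d.2"


def runtime_pm_state(device_dir=NHI_DEVICE):
    """Read the runtime PM policy and current status of a device."""
    power = Path(device_dir) / "power"
    return {
        name: (power / name).read_text().strip()
        for name in ("control", "runtime_status")
    }


def set_runtime_pm_control(device_dir=NHI_DEVICE, mode="on"):
    """Write a runtime PM policy: "on" forbids runtime suspend, "auto" allows it.

    Needs root when pointed at a real sysfs path.
    """
    if mode not in ("on", "auto"):
        raise ValueError(f"unsupported mode: {mode!r}")
    (Path(device_dir) / "power" / "control").write_text(mode)
```

Run as root, `set_runtime_pm_control()` followed by `runtime_pm_state()` would show whether the NHI leaves the "suspended" state. This is an experiment to try, not a known fix.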

It's possible (likely) that I'm barking up the wrong tree. If so, my apologies for the noise, and please let me know so I can turn this into a learning opportunity.

thanks! Dustin

dustinlennon commented 9 months ago

@crawfxrd, I've been back and forth with Francis in the last three weeks. He has insinuated that this issue isn't likely to be addressed any time soon. I'm wondering if you have any more information on its prioritization. Thanks!

leviport commented 9 months ago

It's prioritized relatively highly and we are working on it. We just haven't gotten to the root cause yet. We will keep this issue updated when there are updates to share. No need for the pings.

ravdiculous commented 9 months ago

It's prioritized relatively highly and we are working on it. We just haven't gotten to the root cause yet. We will keep this issue updated when there are updates to share. No need for the pings.

Thanks so much for the time your team is putting into this! If there are any logs that I can pull from my machine for you, please let me know. Otherwise, godspeed :vulcan_salute:

inferentialist commented 8 months ago

I received this update today from the System76 help desk:

Our tech support team is really not qualified to fix this issue. Please refer to the github issue for progress on this.

Apologies for pinging the thread again, but this seems to be where System76 expects customers to resolve these sorts of problems.

A digression

One reads the code of conduct and imagines that this github repo would be a contributor-friendly place. In fact, several of us here, albeit perhaps ignorantly, have enthusiastically volunteered to help in whatever way we can. However, there hasn't been any uptake, which makes it difficult to contribute in a positive capacity.

It was also a bit surprising (to me, anyway) to discover that this forum is presented--again, through the code of conduct--as an open community to improve Pop!OS. Surprising, because it's also apparently intended to function as crowdsourced technical support for System76 customers.

... in terms of customer experience, this is all very uncomfortable :(

crawfxrd commented 8 months ago

There is nothing for users to contribute for this issue. I'm able to reproduce it. I'm unable to determine the cause.

leviport commented 8 months ago

There appears to be a bit of a misunderstanding, so I'd just like to clarify a bit.

Please refer to the github issue for progress on this.

was not intended to translate into

also apparently intended to function as crowdsourced technical support for System76 customers.

This is open source firmware, so customers get front row seats, if they want to spectate. Our Support Team was only pointing towards the first place where new information would be available. Since it's open source, customers are free to get involved if they want to, but we have no expectation that they do so. Hopefully that clarifies things a little. Thank you for your patience while we work on this tricky bug.

inferentialist commented 8 months ago

I didn't want to assume that open firmware is an issue for which there wouldn't be any customer support. So, the feedback is helpful in that regard. I've asked the help desk to close my ticket.

To that end, is there any possibility of reverting back to a manufacturer's firmware?

ravdiculous commented 8 months ago

@crawfxrd and @leviport, thank you again for all the time spent on this. There's only so much time one can spend banging their head against a problem like this. Especially when there are other demands on your time.

Keep on keeping on :fire:

crawfxrd commented 7 months ago

This is not resolved by:

Will have to look at coreboot and FSP configs next.

crawfxrd commented 7 months ago

Doesn't seem to be something as simple as changing Kconfigs or FSP values.

src/soc/intel/common/block/tcss/Kconfig is the only thing left that I see that looks relevant:

config ENABLE_TCSS_DISPLAY_DETECTION
    bool "Enable detection of displays over USB Type-C ports with TCSS"
    depends on SOC_INTEL_COMMON_BLOCK_TCSS && RUN_FSP_GOP
    help
      Enable displays to be detected over Type-C ports during boot.

config ENABLE_TCSS_USB_DETECTION
    bool "Enable detection of USB boot devices attached to USB Type-C ports with TCSS"
    depends on SOC_INTEL_COMMON_BLOCK_TCSS
    help
      Enable USB-C attached storage devices to be detected at boot.
      This option is required for some payloads (eg, edk2), without which devices attached
      to USB-C ports will not be detected and available to boot from.

Ref: CB:72909


Selecting those and adding a non-null usbc_get_ops makes detection of my USB storage devices work.
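To confirm that a given build actually has the two options selected, a quick check of the generated coreboot `.config` can help. This sketch relies only on the standard Kconfig output conventions (`CONFIG_<NAME>=y` for enabled options, `# CONFIG_<NAME> is not set` for disabled ones); the function name is made up, and it does not check the usbc_get_ops part.

```python
import re

# The two TCSS options quoted above.
WANTED = ("ENABLE_TCSS_DISPLAY_DETECTION", "ENABLE_TCSS_USB_DETECTION")


def kconfig_states(config_text, names=WANTED):
    """Return {option: value}, where value is "y", "n", or None if absent."""
    states = {name: None for name in names}
    for raw in config_text.splitlines():
        line = raw.strip()
        enabled = re.fullmatch(r"CONFIG_(\w+)=(.*)", line)
        if enabled and enabled[1] in states:
            states[enabled[1]] = enabled[2]
            continue
        disabled = re.fullmatch(r"# CONFIG_(\w+) is not set", line)
        if disabled and disabled[1] in states:
            states[disabled[1]] = "n"
    return states
```

Feeding it the contents of the build's `.config` shows at a glance whether either option is missing or disabled. An option reported as None usually means its dependencies (for example RUN_FSP_GOP for the display option) were not met.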

ddetton commented 7 months ago

I don't know if the issue I am working on is related to this or not, but here are the details from System76 case #157412, for those who don't have access to S76 cases:

I know that Windows is not officially supported for System76 laptops, but I have a question about Thunderbolt. I have an Oryx Pro (oryp11) and I just installed Windows 11 on a dedicated NVMe drive. It is mostly working, except that it is having a problem with the Thunderbolt Controller. The device manager reports that "Windows has stopped this device because it has reported problems (code 43)". I have installed all of the Windows drivers on the system76 github and I still have this issue. I have run the Intel Driver Update utility and all drivers are reported current. I know that the Thunderbolt device I am trying to use (Plugable TBT3-UDZ) is good because I can plug it into an older HP Windows 11 laptop and it works fine. The Thunderbolt driver on the Windows 11 laptop is dated 9-3-2018, whereas the driver date on the oryp11 is 5-23-23. Btw, this same device works fine in pop-os! on the same hardware. I have also tried this on a fresh Windows 10 install with the same results: the Thunderbolt Controller will not load. Removing and reconnecting the TB3 cable does not restore function. Also, if I plug a USB-C thumb drive into the TB3 port, it works fine.

mirsev commented 1 month ago

Hi, can I ask if this issue will be fixed in oryp10 soon?