hertg / egpu-switcher

🖥🐧 Setup script for eGPUs in Linux (X.Org)
GNU General Public License v3.0

Unable to read PCI information: pci config file has an invalid format #122

Open mszaro opened 1 day ago

mszaro commented 1 day ago

Hey there, I'm running into an issue that seems related to https://github.com/hertg/egpu-switcher/issues/91:

mszaro@miniserv:~$ sudo egpu-switcher config
[error] unable to read pci information from sysfs: got error while scanning device '0000:04:01.0': the pci 'config' file has an invalid format

This is with 0.19.0, manually installed by pulling the binary.

The device in question as seen by lspci:

04:01.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge DD 2018] (rev 06) (prog-if 00 [Normal decode])
        Subsystem: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge DD 2018]
        !!! Unknown header type 7f
        Interrupt: pin ? routed to IRQ 127
        IOMMU group: 18
        I/O behind bridge: 00006000-00006fff [size=4K] [32-bit]
        Memory behind bridge: 5e000000-5e2fffff [size=3M] [32-bit]
        Prefetchable memory behind bridge: 0000006000000000-000000640fffffff [size=16640M] [64-bit]
        Kernel driver in use: pcieport

This seems to be unrelated to my actual eGPU itself, but for the sake of completeness, it's an RX 7800M in an external enclosure. I haven't installed any drivers manually, just attempting to roll with the amdgpu support already in the kernel. (Linux Mint 22 x64, kernel 6.8.0-49-generic).

hertg commented 1 day ago

Might be an issue similar to #119, which I wasn't able to reproduce or resolve at the time. If you could send me the original config file for your device at 04:01.0, it might be possible to debug this one. (See my comment here for instructions on how you can send it to me in encrypted form.)

The !!! Unknown header type 7f line in your lspci output makes me think that my gopci library can't handle that device; maybe it reports an unexpected/invalid configuration...
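For context, the header type lives in a single byte at offset 0x0E of the PCI config space: bit 7 is the multi-function flag and bits 6:0 select the header layout. A value of 7f in those low bits is consistent with the byte reading back as 0xff, which is what a device that does not respond typically returns. The Go sketch below only illustrates that decoding; `decodeHeaderType` and `layoutName` are made-up names and not part of gopci.

```go
package main

import "fmt"

// headerTypeOffset is the offset of the Header Type register in PCI config space.
const headerTypeOffset = 0x0E

// decodeHeaderType splits the Header Type byte into its layout code (bits 6:0)
// and the multi-function flag (bit 7). Hypothetical helper, not gopci's API.
func decodeHeaderType(b byte) (layout byte, multiFunction bool) {
	return b & 0x7F, b&0x80 != 0
}

// layoutName maps the three defined layout codes to a readable name.
// Every other value is reported as unknown.
func layoutName(layout byte) string {
	switch layout {
	case 0x00:
		return "general device"
	case 0x01:
		return "PCI-to-PCI bridge"
	case 0x02:
		return "CardBus bridge"
	default:
		return "unknown"
	}
}

func main() {
	// 0xFF is what an unreachable device would typically return for every byte.
	for _, raw := range []byte{0x00, 0x01, 0x81, 0xFF} {
		layout, mf := decodeHeaderType(raw)
		fmt.Printf("offset 0x%02x raw=0x%02x layout=0x%02x (%s) multifunction=%v\n",
			headerTypeOffset, raw, layout, layoutName(layout), mf)
	}
}
```

A bridge such as the one at 04:01.0 would normally report layout 0x01, so a 7f there suggests the read itself went wrong rather than that the device uses an exotic layout.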

mszaro commented 18 hours ago

Well, that's interesting. I would be happy to send it to you, but I think I just worked out why it doesn't parse; it's empty!

mszaro@miniserv:/sys/bus/pci/devices/0000:04:01.0$ du config
0       config
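One caveat on the check above: du reports allocated disk blocks, and files in sysfs are virtual, so du prints 0 for them regardless of what a read returns. Reading the bytes is a more reliable test. The following Go sketch does that and classifies the result; `describeConfig` and its labels are illustrative assumptions, and the default path is simply the device from this report.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// minHeaderLen is the size of the standard PCI configuration header in bytes.
const minHeaderLen = 64

// describeConfig gives a rough classification of raw config-space bytes.
// Hypothetical helper for diagnosis only.
func describeConfig(data []byte) string {
	switch {
	case len(data) == 0:
		return "empty"
	case len(data) < minHeaderLen:
		return "truncated"
	case bytes.Count(data, []byte{0xFF}) == len(data):
		// Every byte is 0xFF: the device most likely did not answer the read.
		return "all 0xff"
	default:
		return "plausible"
	}
}

func main() {
	// Default to the device from this issue; pass another path as the first argument.
	path := "/sys/bus/pci/devices/0000:04:01.0/config"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("%s: %d bytes read, %s\n", path, len(data), describeConfig(data))
}
```

Without root the kernel usually exposes only the first 64 bytes of the file, which is still enough for this check.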

I can't explain what this device is - wondering if it's something virtual created to use a TB3 cable between two TB4 devices? - but at any rate since this is a non-GPU device, perhaps the tool could ignore devices with malformed configs?
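A rough Go sketch of that "skip and continue" idea is below. It is not gopci's actual code: `device`, `parseConfig`, `scan`, and `errInvalidConfig` are hypothetical names, and the validity rules (at least 64 bytes, vendor ID not 0xFFFF) are assumptions about what a reasonable minimum check could look like.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errInvalidConfig marks a config file that cannot be parsed (hypothetical error).
var errInvalidConfig = errors.New("pci config has an invalid format")

// device holds the few header fields this sketch needs.
type device struct {
	Address  string
	VendorID uint16
	DeviceID uint16
	Class    byte // base class code; 0x03 is a display controller
}

// parseConfig reads the little-endian IDs and the base class from the header.
// It rejects short reads and the reserved vendor ID 0xFFFF.
func parseConfig(addr string, data []byte) (device, error) {
	if len(data) < 64 {
		return device{}, errInvalidConfig
	}
	vendor := uint16(data[0]) | uint16(data[1])<<8
	if vendor == 0xFFFF {
		return device{}, errInvalidConfig
	}
	return device{
		Address:  addr,
		VendorID: vendor,
		DeviceID: uint16(data[2]) | uint16(data[3])<<8,
		Class:    data[0x0B],
	}, nil
}

// scan walks root (for example /sys/bus/pci/devices) and returns the parsed
// devices plus the addresses it skipped, instead of failing on the first bad one.
func scan(root string) ([]device, []string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, nil, err
	}
	var devices []device
	var skipped []string
	for _, e := range entries {
		data, readErr := os.ReadFile(filepath.Join(root, e.Name(), "config"))
		if readErr != nil {
			skipped = append(skipped, e.Name())
			continue
		}
		dev, parseErr := parseConfig(e.Name(), data)
		if parseErr != nil {
			skipped = append(skipped, e.Name())
			continue
		}
		devices = append(devices, dev)
	}
	return devices, skipped, nil
}

func main() {
	devices, skipped, err := scan("/sys/bus/pci/devices")
	if err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	for _, d := range devices {
		fmt.Printf("%s vendor=%04x device=%04x class=%02x\n", d.Address, d.VendorID, d.DeviceID, d.Class)
	}
	fmt.Println("skipped (unreadable or malformed config):", skipped)
}
```

Returning the skipped addresses separately keeps the failure visible, so the tool could still print a warning for a device like 0000:04:01.0 while carrying on with the GPUs it can read.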