magic-blue-smoke / Dual-Edge-TPU-Adapter

Dual Edge TPU Adapter: use the Dual Edge TPU on a system with a single PCIe port on an M.2 A/B/E/M slot

"PCIe error as occurred" at boot with a Dell R230 server #27

Open ralawa opened 2 years ago

ralawa commented 2 years ago

Hi,

I have just installed the PCIe adapter with the Coral Dual Edge TPU board in a Dell R230 server. The server does not boot and complains about a PCIe error. Do you think there is anything I can do to make it work?

The adapter works without any issue in a Dell desktop and the 2 TPUs are detected.

Thank you.
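For anyone comparing a working and a non-working machine, here is a minimal sketch of how to count the Edge TPUs that Linux enumerated by scanning sysfs. It assumes the TPUs report PCI vendor ID `0x1ac1` and device ID `0x089a`. The function name `find_edge_tpus` and the `sysfs_root` parameter are hypothetical and used only for illustration.

```python
from pathlib import Path

# Assumption: Coral PCIe Edge TPUs report these PCI vendor/device IDs.
EDGE_TPU_VENDOR = "0x1ac1"
EDGE_TPU_DEVICE = "0x089a"


def find_edge_tpus(sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted PCI addresses of devices matching the Edge TPU IDs.

    `sysfs_root` is a parameter only so the function can be pointed at a
    test directory; on a real system the default path is the one to use.
    """
    root = Path(sysfs_root)
    found = []
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip().lower()
            device = (dev / "device").read_text().strip().lower()
        except OSError:
            # Entry without readable vendor/device files: skip it.
            continue
        if vendor == EDGE_TPU_VENDOR and device == EDGE_TPU_DEVICE:
            found.append(dev.name)
    return found


if __name__ == "__main__":
    tpus = find_edge_tpus()
    print(f"{len(tpus)} Edge TPU(s) found: {tpus}")
```

With the adapter working, both TPUs should show up, so the list would contain two addresses.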

magic-blue-smoke commented 2 years ago

Hi @ralawa. If this can't be fixed, it would be the first known incompatible configuration. My guess is that the server is trying some "smart" things that the adapter doesn't support, or that it can't fall back to Gen2 x1 mode. Looking at the BIOS settings, I see an option called "Slot Disablement"; from its description it's not clear to me whether it disables the slot completely or only ignores the Option ROM and UEFI drivers for it. Could you try this?

ralawa commented 2 years ago

Hi,

Thank you for your response.

When set to "Disabled", the server boots but the adapter is not detected by the OS, which is what the BIOS help says to expect. When set to "Boot Driver Disabled", the server boots but the Linux kernel crashes during boot.

Regards.

magic-blue-smoke commented 2 years ago

@ralawa thanks for trying these options. When Linux crashes, are there any informative logs/messages? Also, is it possible to try another PCIe slot?

ralawa commented 2 years ago

Hi,

Attached are the kernel panic logs. I can only use one slot; the other PCIe slot is used by the PERC RAID controller. I also upgraded the BIOS, but the issue persists.

I believe that there is nothing more to do.

Regards.

[Attached image: IMG_20220916_111753 (kernel panic log)]

magic-blue-smoke commented 2 years ago

@ralawa I see. Please contact me using the form at the bottom of the page.

reaperharvest commented 1 year ago

I have the same issue on a Dell R710.

magic-blue-smoke commented 1 year ago

@reaperharvest Unfortunately, I don't have access to Dell servers to reproduce and diagnose the issue. Please contact me using the form at the bottom of the page for a refund and/or alternative options.

magic-blue-smoke commented 1 year ago

I see something similar on the Dell Support Forums. It seems to me that the PCIe root complex fails to fall back from x4 to x1, a fallback that the PCIe specification requires.
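On a machine that does boot with the adapter, the negotiated link can be compared against the maximum the port advertises. Below is a minimal sketch that reads the standard Linux sysfs attributes `current_link_width`, `max_link_width`, `current_link_speed` and `max_link_speed` for one PCI device. The helper names `read_link_status` and `trained_narrower` are hypothetical, and the PCI address to inspect has to be supplied by the user (for example, the root port the slot hangs off).

```python
import sys
from pathlib import Path

LINK_ATTRS = (
    "current_link_speed",
    "current_link_width",
    "max_link_speed",
    "max_link_width",
)


def read_link_status(device_dir):
    """Read the PCIe link attributes of one device directory in sysfs.

    Attributes that are missing or unreadable are reported as None.
    """
    dev = Path(device_dir)
    status = {}
    for attr in LINK_ATTRS:
        try:
            status[attr] = (dev / attr).read_text().strip()
        except OSError:
            status[attr] = None
    return status


def trained_narrower(status):
    """Return True if the link trained at fewer lanes than the maximum,
    False if not, and None if the widths could not be read or parsed."""
    try:
        return int(status["current_link_width"]) < int(status["max_link_width"])
    except (TypeError, ValueError):
        return None


if __name__ == "__main__":
    # Usage: python link_status.py <pci-address> [<pci-address> ...]
    for address in sys.argv[1:]:
        status = read_link_status(Path("/sys/bus/pci/devices") / address)
        print(address, status, "narrower than max:", trained_narrower(status))
```

A link that reports width 1 on a port whose maximum is 4 would indicate that the fallback to x1 worked on that system. This cannot be run on a server that fails before the OS starts, so it is mainly useful for comparing working hosts.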

chino-lu commented 1 year ago

Facing the same on an R330. An R720 is working fine.

cbrherms commented 9 months ago

Ran into the same issue on my R330, but the adapter appears to be detected fine on my R430. I guess I'll change my deployment plans and just use that hardware instead.