ufrisk / pcileech-fpga

FPGA modules used together with the PCILeech Direct Memory Access (DMA) Attack Software

[PCIe Squirrel] PC crashed during the flashing procedure #103

Closed benschlueter closed 2 years ago

benschlueter commented 2 years ago

Hello,

My PC freezes during the flashing procedure and then restarts, without powering the PCIe card. I have to unplug and replug the card for it to be recognized as a PCIe device again. The OpenOCD log looks as follows:

ben@192:~/pciestuff/flash_screamer$ openocd -f flash_screamer_squirrel.cfg
Open On-Chip Debugger 0.11.0+dev-00687-g7d2ea186c (2022-05-19-14:21)
Licensed under GNU GPL v2
For bug reports, read
    http://openocd.org/doc/doxygen/bugs.html
DEPRECATED! use 'adapter driver' not 'interface'
DEPRECATED! use 'ftdi vid_pid' not 'ftdi_vid_pid'
DEPRECATED! use 'ftdi channel' not 'ftdi_channel'
DEPRECATED! use 'ftdi layout_init' not 'ftdi_layout_init'
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
DEPRECATED! use 'adapter speed' not 'adapter_khz'
Info : ftdi: if you experience problems at higher adapter clocks, try the command "ftdi tdo_sample_edge falling"
Info : clock speed 10000 kHz
Info : JTAG tap: xc7.tap tap/device found: 0x0362d093 (mfg: 0x049 (Xilinx), part: 0x362d, ver: 0x0)

I tried both the prebuilt bitstream binary and a self-compiled one; the error stays the same. Any idea what is causing it?

benschlueter commented 2 years ago

When the device is not recognized as a PCIe device and I flash it, it works. But the kernel should not freeze during the flashing procedure in the first place, right?

ufrisk commented 2 years ago

Is the device working now? Did you manage to flash it with your custom firmware?

benschlueter commented 2 years ago

Yes, it works. However, I find it a little strange that the system crashes when I want to reflash the firmware. Not sure what the root cause is.

ufrisk commented 2 years ago

I haven't heard about this behavior before.

My educated guess is that if your kernel was somehow communicating with the device at the time of flashing, it may have failed like this. When flashing, the current configuration on the FPGA is wiped instantly to load the flash program, and the PCIe core is brought down abruptly from the host system's point of view.

I don't know how to avoid this except by powering the device separately when flashing. I think the issue is rather uncommon though (since I haven't heard about it before).

If my guess is correct there is not much my project can do about it really. It's more of an OpenOCD / Linux issue.

I'll leave this issue open for a month or so just in case others have this problem also. If you have a similar problem please let me know in this thread.
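One untested idea for the Linux side, not something confirmed in this thread: ask the kernel to detach the PCIe function through sysfs before flashing, so nothing on the host is talking to the device when the FPGA configuration is wiped. The sketch below assumes the standard Linux sysfs files `/sys/bus/pci/devices/<BDF>/remove` and `/sys/bus/pci/rescan`; the address `0000:03:00.0` is a hypothetical placeholder (look up the real one with `lspci`), and the helper names are made up for illustration. Whether this actually prevents the freeze on the affected system is unknown.

```python
from pathlib import Path

SYSFS_PCI_DEVICES = "/sys/bus/pci/devices"  # standard Linux sysfs location
SYSFS_PCI_RESCAN = "/sys/bus/pci/rescan"


def sysfs_remove_path(bdf, sysfs_root=SYSFS_PCI_DEVICES):
    """Return the sysfs 'remove' file for a PCI address like 0000:03:00.0."""
    return Path(sysfs_root) / bdf / "remove"


def detach_pci_device(bdf, sysfs_root=SYSFS_PCI_DEVICES, dry_run=True):
    """Ask the kernel to drop the device (needs root when dry_run is False)."""
    path = sysfs_remove_path(bdf, sysfs_root)
    if not dry_run:
        path.write_text("1")
    return path


def rescan_pci_bus(rescan_file=SYSFS_PCI_RESCAN, dry_run=True):
    """Ask the kernel to re-enumerate PCI devices after flashing."""
    path = Path(rescan_file)
    if not dry_run:
        path.write_text("1")
    return path


if __name__ == "__main__":
    # Hypothetical address; dry_run=True only prints what would be written.
    print("would write 1 to", detach_pci_device("0000:03:00.0"))
    print("then flash with openocd, then write 1 to", rescan_pci_bus())
```

With `dry_run=False` and root privileges, the order would be: detach, run `openocd -f flash_screamer_squirrel.cfg`, then rescan. A rescan may not be enough to bring the card back; the original report needed a physical replug.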

lampii-temporary commented 2 years ago

I am having similar issues, although they seem unrelated to flashing. The PC throws a CLOCK_WATCHDOG_TIMEOUT BSOD or freezes on boot when Windows initializes. This happens on every boot, and the only way to get past it is to reset the PCIe card at some point after the BIOS but before Windows starts.

ufrisk commented 2 years ago

@lampii This is a separate issue. Do you have any custom firmware loaded on it, or is it the stock firmware you're experiencing these issues with? Do you have any overclocking of the PCIe enabled? If so, can you please try disabling it? A BIOS upgrade has sometimes been known to resolve some issues.

lampii-temporary commented 2 years ago

Hi @ufrisk, thanks for the reply! Stock firmware, no overclocking. Currently running on an old Asus P9x79 with an i7 3820. The BIOS is very old (2012); the newest is from 2014. I will give it a shot and post the results in a new issue.

ufrisk commented 2 years ago

Please let me know how it goes. Sometimes changing slots may help as well. Or it may be a hardware issue. But start with a BIOS upgrade and changing slots (if possible).

benschlueter commented 2 years ago

I also experience some hardware crashes with the Squirrel when sending raw TLPs. For example, this crashed my system: ./pcileech tlp -vvv -in 20000001e30080ff0000003000c00000. Not sure if it is related to the chipset, hardware, or firmware, or to a wrong TLP (but this should be fine; it is just a basic 4-byte read). The system is a dual-socket 3rd Gen Intel Xeon Scalable machine.

ufrisk commented 2 years ago

Your TLP is malformed. It has byte enables set for both the first and the last dword whilst you're only reading one dword.

That doesn't work on my system either, but that system just drops the packet and doesn't crash. It may be something else as well, but start by looking at the BEs.
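To make the byte-enable point concrete, here is a minimal sketch that decodes the header of the TLP above and checks the PCIe rule that a one-dword request must have Last DW BE set to 0000b. The function names are made up for illustration and are not part of PCILeech; only the memory request header layout (Fmt/Type/Length in dword 0, Requester ID/Tag/Last BE/First BE in dword 1) is taken from the PCIe specification.

```python
import struct


def decode_tlp_header(hex_str):
    """Decode the header fields of a memory request TLP given as a hex string."""
    raw = bytes.fromhex(hex_str)
    if len(raw) < 12:
        raise ValueError("need at least a 3DW header")
    dw0, dw1 = struct.unpack(">II", raw[:8])
    fmt = (dw0 >> 29) & 0x7
    four_dw = bool(fmt & 0x1)          # Fmt bit 0: 4DW header (64-bit address)
    if four_dw and len(raw) < 16:
        raise ValueError("4DW header needs 16 bytes")
    addr_bytes = raw[8:16] if four_dw else raw[8:12]
    length = dw0 & 0x3FF
    return {
        "fmt": fmt,
        "type": (dw0 >> 24) & 0x1F,
        "length_dw": length if length else 1024,  # 0 encodes 1024 dwords
        "requester_id": dw1 >> 16,
        "tag": (dw1 >> 8) & 0xFF,
        "last_be": (dw1 >> 4) & 0xF,
        "first_be": dw1 & 0xF,
        "address": int.from_bytes(addr_bytes, "big") & ~0x3,
    }


def check_byte_enables(hdr):
    """Return a list of byte-enable rule violations (empty if none found)."""
    problems = []
    if hdr["length_dw"] == 1 and hdr["last_be"] != 0:
        problems.append("Length is 1 DW, so Last DW BE must be 0000b")
    if hdr["length_dw"] > 1 and hdr["last_be"] == 0:
        problems.append("Length is > 1 DW, so Last DW BE must not be 0000b")
    return problems


if __name__ == "__main__":
    original = "20000001e30080ff0000003000c00000"
    hdr = decode_tlp_header(original)
    print(hdr)
    print(check_byte_enables(hdr))
```

The header decodes as a 64-bit memory read (Fmt 001b, Type 00000b) of one dword with First BE and Last BE both 1111b. Clearing Last BE would give `20000001e300800f0000003000c00000`, which passes this check; whether that alone avoids the crash on the affected system has not been verified here.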

ufrisk commented 2 years ago

I'm closing this issue since I believe it has been answered. If there are still any questions, please let me know.