hashlash / devices

Devices Configuration

AER: PCIe Bus Error #2

Open hashlash opened 3 years ago

hashlash commented 3 years ago

The following messages are logged repeatedly to /var/log/kern.log and /var/log/syslog, which have (so far) piled up to 29GB.

Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709288] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709293] pcieport 0000:00:1c.5: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709295] pcieport 0000:00:1c.5: AER:   device [8086:9d15] error status/mask=00000001/00002000
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709296] pcieport 0000:00:1c.5: AER:    [ 0] RxErr                 

The output of lspci for the affected port:

$ lspci -nn | grep 1c.5
00:1c.5 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #6 [8086:9d15] (rev f1)
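
To get a rough idea of how much of a log is made up of these messages, the lines for one port can be counted. This is only a sketch: count_aer_lines is a hypothetical helper name, and it assumes every AER line for the port contains the text "pcieport <address>: AER:" as in the excerpt above.

```shell
# Hypothetical helper: count the log lines that AER reported for one PCIe port.
# $1: log file, $2: PCI address of the port (for example 0000:00:1c.5)
count_aer_lines() {
  # -F treats the pattern as a fixed string, so the dots in the address are literal.
  grep -c -F -- "pcieport $2: AER:" "$1"
}
```

For example: count_aer_lines /var/log/syslog 0000:00:1c.5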

hashlash commented 3 years ago

https://askubuntu.com/questions/771899/pcie-bus-error-severity-corrected

TL;DR

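Add pci=nomsi (disables MSI interrupts) or pci=noaer (disables PCIe Advanced Error Reporting) to the kernel boot parameters in /etc/default/grub as shown (assuming the default parameters are quiet and splash), then run sudo update-grub and reboot: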

...
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"
...
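
After a reboot, it may be worth checking that the parameter actually reached the kernel. A minimal sketch, assuming the running kernel's command line is readable at /proc/cmdline; has_kernel_param is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: succeed if a parameter is present on the kernel command line.
# $1: parameter to look for (for example pci=nomsi)
# $2: file holding the command line (optional, defaults to /proc/cmdline)
has_kernel_param() {
  # Put each parameter on its own line, then require an exact whole-line match.
  tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example: has_kernel_param pci=nomsi && echo "pci=nomsi is active"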

hashlash commented 3 years ago

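Find and remove the repeated four-line entries with sed (GNU sed is needed, since \s and labels terminated by ; are GNU extensions):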

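$ sed -i '
# Lines 1-3 of a group: on a mismatch, jump to :x and print unchanged.
/Corrected error received: 0000:00:1c\.5$/!bx;N;
/(Receiver ID)$/!bx;N;
/00002000$/!bx;N;
# Line 4 matched as well: delete the whole four-line group.
/RxErr\s*$/!bx;d;
:x' /var/log/syslog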

TODO: instead of deleting every match, collapse consecutive repeats of the pattern (if possible) into one entry followed by a repeat count, something like:

Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709288] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709293] pcieport 0000:00:1c.5: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709295] pcieport 0000:00:1c.5: AER:   device [8086:9d15] error status/mask=00000001/00002000
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709296] pcieport 0000:00:1c.5: AER:    [ 0] RxErr
Nov 21 19:52:39 hashlash-K401UQK kernel: [    2.709297] pcieport 0000:00:1c.5: AER: ----- repeated log 123 times -----

I think the log itself may still be useful :thinking:
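
One possible way to approach the TODO is an awk filter wrapped in a shell function. This is a sketch under several assumptions, not a tested replacement for the sed command: collapse_aer is a hypothetical name; a group is taken to start at a "Corrected error received" line and to include the AER lines that follow it; two groups count as repeats when their text is identical once the syslog and kernel timestamps are stripped; and the number reported is how many repeats were dropped after the first copy.

```shell
# Hypothetical helper: reads a syslog-style log on stdin and writes the
# collapsed log to stdout. The input file is not modified.
collapse_aer() {
  awk '
    # Message text without the syslog and kernel timestamps, so that
    # repeats logged at different times compare equal.
    function msg(line) {
      sub(/^.*kernel: \[[ 0-9.]*\] /, "", line)
      return line
    }
    # Print the repeat-count line for the last printed group, if needed.
    function summary() {
      if (repeats > 0)
        printf "%s----- repeated log %d times -----\n", last_head, repeats
      repeats = 0
    }
    # Close the group being read: count it if it repeats the previous
    # group, otherwise print it in full.
    function end_group() {
      if (cur == "") return
      if (cur_key == last_key) { repeats++; last_head = cur_head }
      else { summary(); printf "%s", cur; last_key = cur_key }
      cur = ""; cur_key = ""
    }
    / AER: Corrected error received: / {   # first line of a group
      end_group()
      cur = $0 "\n"; cur_key = msg($0) "\n"
      cur_head = substr($0, 1, index($0, " AER: ") + 5)
      next
    }
    / AER: / && cur != "" {                # continuation line of a group
      cur = cur $0 "\n"; cur_key = cur_key msg($0) "\n"
      next
    }
    { end_group(); summary(); last_key = ""; print }   # any other line
    END { end_group(); summary() }
  '
}
```

For example: collapse_aer < /var/log/syslog > /tmp/syslog.collapsed (the output path is arbitrary). Writing to a separate file keeps the original log intact until the result has been checked.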