Dasharo / dasharo-issues

The Dasharo issue tracker
https://dasharo.com/

Protectli jsl V1210|test case: STB002.001| dmesg flooded with PCIe errors #943

Open wiktormowinski opened 2 months ago

wiktormowinski commented 2 months ago

Component

Dasharo firmware, other

Device

protectli v1210

Dasharo version

0.9.2

Dasharo Tools Suite version

-

Test case ID

STB002.001

Brief summary

During the STB002.001 test case, the check fails because dmesg is flooded with PCIe errors.

How reproducible

happens every time

How to reproduce

Run STB002.001 on a Protectli V1210 with Dasharo 0.9.2.

Expected behavior

PASSing the STB002.001 criteria

Actual behavior

Shortly after loading the test case an error message pops up:

ath10k_pci 0000:04:00.0: AER: Error of this Agent is reported first: 'Bluetooth: hci0: Malformed MSFT vendor event: 0x02' does not contain 'tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead'

Screenshots

No response

Additional context

No response

Solutions you've tried

No response

macpijan commented 1 month ago

@wiktormowinski Why did you choose "other" as the device when submitting the bug? You should be able to correctly choose V1210, right?

macpijan commented 1 month ago

The error in question looks like this:

[   99.407234] ath10k_pci 0000:04:00.0: AER:   Error of this Agent is reported first
[  100.174008] pcieport 0000:00:1c.4: AER: Multiple Correctable error message received from 0000:04:00.0
[  100.174057] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[  100.174060] ath10k_pci 0000:04:00.0:   device [168c:003e] error status/mask=00000081/00006000
[  100.174065] ath10k_pci 0000:04:00.0:    [ 0] RxErr                  (First)
[  100.174067] ath10k_pci 0000:04:00.0:    [ 7] BadDLLP               
[  102.184718] pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:04:00.0
[  102.184762] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
[  102.184766] ath10k_pci 0000:04:00.0:   device [168c:003e] error status/mask=00000080/00006000
[  102.184770] ath10k_pci 0000:04:00.0:    [ 7] BadDLLP               
[  102.765604] pcieport 0000:00:1c.4: AER: Multiple Correctable error message received from 0000:04:00.0
[  102.765679] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
[  102.765683] ath10k_pci 0000:04:00.0:   device [168c:003e] error status/mask=00000080/00006000
[  102.765687] ath10k_pci 0000:04:00.0:    [ 7] BadDLLP               
[  104.187338] pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:04:00.0
[  104.187385] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[  104.187389] ath10k_pci 0000:04:00.0:   device [168c:003e] error status/mask=00000081/00006000
[  104.187393] ath10k_pci 0000:04:00.0:    [ 0] RxErr                  (First)
[  104.187396] ath10k_pci 0000:04:00.0:    [ 7] BadDLLP               

It is repetitive and floods the dmesg log. It is most likely related to the WiFi card firmware.
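
To put a number on how repetitive the flood is, the log can be tallied per device and error layer. The snippet below is only a sketch: it assumes the dmesg line format shown above, and the names `BUS_ERROR`, `summarize_bus_errors` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# [  100.174057] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
BUS_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+ "
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+),"
)


def summarize_bus_errors(dmesg_text):
    """Count bus errors per (device, severity, layer).

    Also returns the time in seconds between the first and last match.
    """
    counts = Counter()
    stamps = []
    for line in dmesg_text.splitlines():
        m = BUS_ERROR.search(line)
        if m:
            counts[(m["dev"], m["severity"], m["layer"])] += 1
            stamps.append(float(m["ts"]))
    span = stamps[-1] - stamps[0] if len(stamps) > 1 else 0.0
    return counts, span


# A few lines copied from the log above.
SAMPLE = """\
[  100.174057] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[  102.184762] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
[  102.765679] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
[  104.187385] ath10k_pci 0000:04:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
"""

if __name__ == "__main__":
    counts, span = summarize_bus_errors(SAMPLE)
    for key, n in sorted(counts.items()):
        print(key, n)
    print(f"{sum(counts.values())} errors in {span:.1f} s")
```

Feeding it the full output of `dmesg` instead of `SAMPLE` makes it easy to compare the error rate between firmware builds or settings.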

Perhaps the workaround from https://kb.protectli.com/kb/wifi-on-the-vault/ is not perfect here.

wiktormowinski commented 1 month ago

> @wiktormowinski Why did you choose "other" as the device when submitting the bug? You should be able to correctly choose V1210, right?

There is no V1210 in the form's drop-down list, I'm afraid. I edited it manually.

krystian-hebel commented 1 month ago

This error doesn't indicate a problem with the TPM. The TPM line in the message is just one of the allowed error lines in the dmesg output: https://github.com/Dasharo/open-source-firmware-validation/blob/7c18adbea823700a5d8a06cd750a5f3b4ea8fca0/lib/linux.robot#L37
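
The allowlist idea can be illustrated roughly as follows. This is not the actual Robot Framework keyword from the linked file; it is a Python sketch that assumes entries are matched as plain substrings. Only the TPM string is taken from this thread, and `ALLOWED_ERRORS` and `unexpected_errors` are hypothetical names.

```python
# Assumption: allowlist entries are matched as plain substrings of a dmesg line.
# Only the TPM entry is confirmed by this thread; a real list would hold more.
ALLOWED_ERRORS = [
    "tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead",
]


def unexpected_errors(error_lines, allowed=ALLOWED_ERRORS):
    """Return the error lines that match none of the allowed substrings."""
    return [
        line
        for line in error_lines
        if not any(entry in line for entry in allowed)
    ]


if __name__ == "__main__":
    lines = [
        "tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead",
        "ath10k_pci 0000:04:00.0: AER:   Error of this Agent is reported first",
    ]
    # Only the AER line is left over, so only that one would fail the check.
    print(unexpected_errors(lines))
```

Under that assumption, the TPM string shows up in the failure message only because it is the allowlist entry being compared against, not because the TPM misbehaved.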

> Perhaps the workaround from https://kb.protectli.com/kb/wifi-on-the-vault/ is not perfect here.

That workaround is for missing or wrong WiFi firmware. PCIe bus errors are a separate issue.
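
For reference, the `error status/mask=` values in the log are bitmasks over the AER Correctable Error Status register. Below is a small decoding sketch. The bit positions follow the PCIe AER capability and the short names are the ones the Linux kernel prints; the helper names `decode_correctable` and `parse_status_mask` are made up for illustration.

```python
import re

# Correctable Error Status bits (PCIe AER capability), with the short names
# used by the Linux kernel in lines such as "[ 7] BadDLLP".
CORRECTABLE_BITS = {
    0: "RxErr",         # Receiver Error
    6: "BadTLP",        # Bad TLP
    7: "BadDLLP",       # Bad DLLP
    8: "Rollover",      # REPLAY_NUM Rollover
    12: "Timeout",      # Replay Timer Timeout
    13: "NonFatalErr",  # Advisory Non-Fatal Error
    14: "CorrIntErr",   # Corrected Internal Error
    15: "HeaderOF",     # Header Log Overflow
}


def decode_correctable(value):
    """Return the names of all known bits set in a status or mask value."""
    return [
        name
        for bit, name in sorted(CORRECTABLE_BITS.items())
        if value & (1 << bit)
    ]


def parse_status_mask(line):
    """Extract (status, mask) as integers from a dmesg line, or None."""
    m = re.search(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})", line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


if __name__ == "__main__":
    line = ("ath10k_pci 0000:04:00.0:   device [168c:003e] "
            "error status/mask=00000081/00006000")
    status, mask = parse_status_mask(line)
    print("status:", decode_correctable(status))
    print("mask:  ", decode_correctable(mask))
```

The status value 00000081 decodes to RxErr and BadDLLP, matching the lines the kernel printed. These are link-level errors rather than anything produced by the WiFi firmware loader.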

krystian-hebel commented 1 month ago

Disabling ASPM L0s for port 5 (00:1c.4) does not make these errors go away completely: they still appear, but much less often, about once every 3 minutes. However, when L0s is enabled on AMI firmware, no errors are reported. This rules out a hardware problem, so it is a Dasharo issue.
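
For anyone who wants to check the ASPM state of the port from the OS: the setting lives in bits 1:0 of the Link Control register in the PCI Express capability. The sketch below only decodes a register value. It assumes the value was read separately, for example with `setpci -s 00:1c.4 CAP_EXP+10.w`, and it does not describe how L0s was disabled in the experiment above. The function names are made up for illustration.

```python
# ASPM Control field: bits 1:0 of the PCIe Link Control register.
ASPM_STATES = {
    0b00: "disabled",
    0b01: "L0s",
    0b10: "L1",
    0b11: "L0s and L1",
}


def aspm_state(link_control):
    """Describe which ASPM link states are enabled in a Link Control value."""
    return ASPM_STATES[link_control & 0b11]


def without_l0s(link_control):
    """Return the 16-bit Link Control value with only the L0s enable bit cleared."""
    return link_control & ~0b01 & 0xFFFF


if __name__ == "__main__":
    value = 0x0043  # example register value, not read from the V1210
    print(aspm_state(value))
    print(hex(without_l0s(value)), aspm_state(without_l0s(value)))
```

Comparing the decoded value on Dasharo and on AMI firmware would confirm whether both really run the port with L0s enabled.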