ARM-software / bsa-acs

Arm SystemReady : BSA Architecture Compliance Suite
Apache License 2.0

Failure - BSA 804 : PHB,RP must recognize Txn frm upstream #26

Closed sunnywang-arm closed 1 year ago

sunnywang-arm commented 2 years ago

We ran into the BSA 804 PCIe test failure shown below on an ARM platform.

804 : PHB,RP must recognize Txn frm upstream
       Failed. Exception on Memory Access For Bdf : 0x3000000
       Failed on PE - 0
       Checkpoint -- 1 : Result: FAIL

The problem is that BSA.efi directly accesses an address that falls within the non-prefetchable memory window programmed in the SoC's PCIe bridge registers (offset 0x20). The SoC vendor expected BSA.efi to check ACPI _TRA (Address Translation) and apply that offset to reach the correct address, which is how OSes access it. Does it make sense to update BSA.efi to use _TRA? Or is using _TRA a workaround for a hardware issue? If you think it is a workaround, could you point me to supporting information, such as a description in the specification?

For more information about _TRA, please check https://uefi.org/specs/ACPI/6.4/19_ASL_Reference/ACPI_Source_Language_Reference.html#qwordmemory-qword-memory-resource-descriptor-macro
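To illustrate the translation being discussed, here is a minimal sketch in C. It assumes the common interpretation in which _MIN/_MAX of a host bridge window are PCI bus addresses and _TRA is the offset added to a bus address to obtain the CPU address. The struct, its field names, and `bus_to_cpu` are hypothetical and are not part of bsa-acs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one QWordMemory window taken from a
 * host bridge _CRS. Field names are illustrative, not bsa-acs structures. */
typedef struct {
    uint64_t bus_min; /* _MIN: lowest PCI bus address in the window (assumed) */
    uint64_t bus_max; /* _MAX: highest PCI bus address in the window (assumed) */
    uint64_t tra;     /* _TRA: offset added to a bus address to get the CPU address */
} mem_window_t;

/* Translate a PCI bus address (for example, one read from a bridge's
 * memory base register) into a CPU physical address.
 * Returns false when the address lies outside the window. */
static bool bus_to_cpu(const mem_window_t *w, uint64_t bus_addr,
                       uint64_t *cpu_addr)
{
    if (bus_addr < w->bus_min || bus_addr > w->bus_max)
        return false;
    *cpu_addr = bus_addr + w->tra;
    return true;
}
```

With a made-up window of bus addresses 0x40000000 to 0x7FFFFFFF and a _TRA of 0x800000000, the bus address 0x40000000 would map to CPU address 0x840000000. When _TRA is zero the two addresses are identical, which matches the configurations where the test passes today.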

gowthamsiddarthd commented 2 years ago

Hi @sunnywang-arm

We don't have complete ASL parsing code in the UEFI PAL layer today, so this has not been fixed yet.

We will investigate extracting this information at the UEFI level. Until then, this failure can be ignored for this configuration/setup.

The test has been retained in the codebase because, as you mentioned, it works as expected in bare-metal environments and on configurations where this additional address translation is not required or programmed.

Regards, ACS team

semihalf-bernacki-grzegorz commented 2 years ago

Hi @gowthamsiddarthd, I ran the BSA Linux test and it fails on test case "861 : PCIe Unaligned access". I believe the root cause is the same: BAR memory is accessed without adding the translation offset. Please see the log below:

[   86.121877]  861 : PCIe Unaligned access                 
[   86.127450] ------------[ cut here ]------------
[   86.137432] WARNING: CPU: 3 PID: 358 at ../arch/arm64/mm/ioremap.c:46 __ioremap_caller+0xd0/0xf8
[   86.146204] Modules linked in: bsa_acs
[   86.149944] CPU: 3 PID: 358 Comm: bsa Not tainted 5.15.0-00001-g96fa3e64d83f #1
[   86.157237] Hardware name: Marvell Inc. CN98XX-CRB/CN98XX-CRB, BIOS 0.4 Jul  5 2022
[   86.164874] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   86.171819] pc : __ioremap_caller+0xd0/0xf8
[   86.175989] lr : __ioremap_caller+0x58/0xf8
[   86.180159] sp : ffff800013eb3b40
[   86.183460] x29: ffff800013eb3b40 x28: ffff800008fff000 x27: ffff800008fff000
[   86.190584] x26: 0000000000000000 x25: 0000000005010000 x24: 0000000000000014
[   86.197708] x23: 0068000000000f0b x22: ffff800008ffb914 x21: 0000000000000000
[   86.204831] x20: 00000000100c0000 x19: 0000000000001000 x18: ffffffffffffffff
[   86.211954] x17: 53415020203a746c x16: 75736552203a2020 x15: ffff800011745000
[   86.219076] x14: 0000878050101000 x13: ffff800013eb3950 x12: 000000000000003f
[   86.226199] x11: 0000000000001000 x10: ffff80001e919fff x9 : ffff80001e91a000
[   86.233323] x8 : 00000000fbfd0000 x7 : 0000000000000018 x6 : ffff80001214b7c0
[   86.240446] x5 : ffff80001214b7c0 x4 : 0000000000000001 x3 : 0000000000000001
[   86.247568] x2 : 00000000fffd0000 x1 : 0000000000000000 x0 : 0000000000000001
[   86.254690] Call trace:
[   86.257126]  __ioremap_caller+0xd0/0xf8
[   86.260951]  __ioremap+0x4c/0x58
[   86.264168]  pal_memory_ioremap+0x64/0x160 [bsa_acs]
[   86.269129]  val_memory_ioremap+0x10/0x20 [bsa_acs]
[   86.274001]  payload+0x2cc/0x490 [bsa_acs]
[   86.278093]  val_run_test_payload+0x38/0xa0 [bsa_acs]
[   86.283139]  os_p061_entry+0x48/0x88 [bsa_acs]
[   86.287577]  val_pcie_execute_tests+0xec/0x220 [bsa_acs]
[   86.292882]  val_glue_execute_command+0x200/0x350 [bsa_acs]
[   86.298447]  bsa_proc_write+0x88/0x108 [bsa_acs]
[   86.303058]  proc_reg_write+0xb4/0xf0
[   86.306710]  vfs_write+0xc4/0x380
[   86.310015]  ksys_write+0x6c/0x100
[   86.313406]  __arm64_sys_write+0x1c/0x28
[   86.317317]  invoke_syscall+0x44/0x100
[   86.321056]  el0_svc_common.constprop.3+0x6c/0xf0

gowthamsiddarthd commented 1 year ago

Hi @sunnywang-arm,

As per our offline discussion, using the UEFI PCIe protocols defeats the purpose of the test. The vendor could be putting a quirk there to work around PCIe/BSA rules (it could even hide a completely broken ECAM). But that issue will show up when the OS tries to access ECAM using MMIO (just as BSA does). If you put a quirk in UEFI, you will find that you need to put a quirk in the OS PCIe drivers/kernel as well. So we don't need to make a change in BSA, because checking _TRA may overlook the case where partners use _TRA as a workaround for a HW issue.

Regards, Gowtham ACS Team

sunnywang-arm commented 1 year ago

@gowthamsiddarthd can we reopen this issue? We found that 1) all OSes, including Windows, support _TRA and 2) there is a genuine need to use _TRA (so it is not a workaround), so we're working on ECRs for the BSA spec. Therefore, we need to reopen the issue so that bsa-acs can be updated later.

gowthamsiddarthd commented 1 year ago

Reopening the issue for further investigation based on the new input provided.

Regards, ACS Team

snainar-ampere commented 1 year ago

Hi @gowthamsiddarthd, we also hit the same issue (804) on one of our platforms. The test passes if we use the PCIIO protocol to access the BAR resource. Is there any ongoing effort to parse _TRA?

samerhaj commented 1 year ago

The failure is seen on yet another system (with a different SoC) that also has 32-bit MMIO BARs remapped above 4GB:

804 : Check RootPort NP Memory Access START

  BDF is 0x0
   Memory base is 0xFFF00000 Memory lim is  0xFFFFF
  BDF is 0x2000000
   Memory base is 0x40000000 Memory lim is  0x400FFFFF
       Received exception of type: 0
       Failed. Exception on Memory Access For Bdf : 0x2000000
       Failed on PE -    0
       PCI_IN_13
       Checkpoint –  1                           : Result:  FAIL 
         END
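
For readers following the log above, this is a minimal sketch of how a non-prefetchable window can be decoded from the 32-bit register at offset 0x20 of a Type 1 (bridge) configuration header, where the low 16 bits hold Memory Base and the high 16 bits hold Memory Limit. The type and function names are hypothetical and this is not the bsa-acs implementation. The raw register values mentioned afterwards are inferred from the decoded values in the log, not taken from it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded non-prefetchable memory window of a PCI-to-PCI bridge.
 * Both values are PCI bus addresses, not CPU addresses. */
typedef struct {
    uint32_t base;
    uint32_t limit;
} np_window_t;

/* reg is the 32-bit value read at config offset 0x20:
 * bits [15:4] are address bits [31:20] of the base,
 * bits [31:20] are address bits [31:20] of the limit,
 * and the low 20 bits of the limit are implied to be all ones. */
static np_window_t decode_np_window(uint32_t reg)
{
    np_window_t w;
    w.base  = (reg & 0x0000FFF0u) << 16;
    w.limit = (reg & 0xFFF00000u) | 0x000FFFFFu;
    return w;
}

/* A window with base above limit forwards nothing downstream. */
static bool np_window_enabled(np_window_t w)
{
    return w.base <= w.limit;
}
```

Under this decoding, a register value of 0x0000FFF0 gives base 0xFFF00000 and limit 0xFFFFF (a disabled window, as for BDF 0x0), and 0x40004000 gives base 0x40000000 and limit 0x400FFFFF (as for BDF 0x2000000). The decoded base is a bus address, so on a system that remaps 32-bit MMIO above 4GB, accessing it directly as a CPU address would be expected to fault.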