lkrg-org / lkrg

Linux Kernel Runtime Guard
https://lkrg.org

CI: Re-enable AArch64 test, but without BTI #237

Closed solardiz closed 1 year ago

solardiz commented 1 year ago

Description

  1. Re-enable AArch64 test in our GitHub Actions. The test had been introduced in #181, but promptly disabled since it was failing all the time because of our lack of support of BTI in our hackish calls to non-exported kernel functions, tracked as #183.
  2. Now we bypass that issue by specifying to QEMU -cpu cortex-a57 instead of -cpu max, so that we do test on AArch64 but without BTI. I got -cpu cortex-a57 from https://wiki.ubuntu.com/ARM64/QEMU
  3. We also switch to testing on Ubuntu Jammy instead of Ubuntu Impish, as the latter had been EOL'ed and is no longer downloadable (at least by this test as previously configured).

How Has This Been Tested?

The testing is described in some detail in comments I added today to #181. In short, this test now passes in my fork, whereas a different test with only re-enabling and switching to Jammy, but keeping -cpu max, fails with the BTI error like before. So all 3 of the changes here are required.

solardiz commented 1 year ago

@vt-alt Do you have any advice on whether/how we can possibly have two variations of this test at once - with and without BTI?

In a private fork of this repo, @redplait and I have just tried adding an extra args block:

                    # test with BTI
                    - args: image=arm64v8/ubuntu:jammy
                            opts="-M virt,gic-version=3 -cpu max"
                            console=ttyAMA0
                            root=/dev/vda
                            qemu=qemu-system-aarch64
                      install: qemu-system-arm

However, this didn't result in an extra job running, perhaps because both have image=arm64v8/ubuntu:jammy and that becomes the job name. So how can we have different names yet use the same image?

solardiz commented 1 year ago

However, this didn't result in an extra job running, perhaps because both have image=arm64v8/ubuntu:jammy and that becomes the job name. So how can we have different names yet use the same image?

No, I'm wrong - this did result in both jobs running, confusingly under the same name in GitHub UI, but clearly separated in e-mail that GitHub sent me:

  * build (image=arm64v8/ubuntu:jammy opts="-M virt,gic-version=3 -cpu cortex-a57" console=ttyAMA0 ro... succeeded (2 annotations)
  * build (image=arm64v8/ubuntu:jammy opts="-M virt,gic-version=3 -cpu max" console=ttyAMA0 root=/dev... failed (3 annotations)

So the question is - can we assign them different names, to make it easier to distinguish them in GitHub web UI? @vt-alt
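As an illustration of how the two job labels above differ, here is a minimal Python sketch that pulls the -cpu value out of such an args string. It assumes the string follows shell-style quoting; the helper names `parse_args` and `cpu_model` are hypothetical, and the actual CI action may parse args differently.

```python
import shlex

def parse_args(args: str) -> dict:
    """Split a key=value string with shell-style quoting into a dict."""
    result = {}
    for token in shlex.split(args):
        key, _, value = token.partition("=")
        result[key] = value
    return result

def cpu_model(args: str) -> str:
    """Return the value that follows -cpu inside opts, or '' if absent."""
    opts = shlex.split(parse_args(args).get("opts", ""))
    if "-cpu" in opts and opts.index("-cpu") + 1 < len(opts):
        return opts[opts.index("-cpu") + 1]
    return ""

# The "test with BTI" block quoted earlier in this comment thread.
WITH_BTI = ('image=arm64v8/ubuntu:jammy '
            'opts="-M virt,gic-version=3 -cpu max" '
            'console=ttyAMA0 root=/dev/vda qemu=qemu-system-aarch64')

# Only the part of the non-BTI job that is visible in the e-mail subject.
NO_BTI = 'image=arm64v8/ubuntu:jammy opts="-M virt,gic-version=3 -cpu cortex-a57"'

if __name__ == "__main__":
    for label, args in (("with BTI", WITH_BTI), ("without BTI", NO_BTI)):
        print(label, "->", cpu_model(args))
```

A helper like this could also generate a short, distinct label per job instead of relying on the truncated args string.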

vt-alt commented 1 year ago

Each (-) element in the include array is a hash table, so we can add any key, such as name:, before args:, and it should be displayed first. It should look like:

                    - name: test with BTI
                      args: image=arm64v8/ubuntu:jammy
                            opts="-M virt,gic-version=3 -cpu max"
                            console=ttyAMA0
                            root=/dev/vda
                            qemu=qemu-system-aarch64
                      install: qemu-system-arm

vt-alt commented 1 year ago

I don't see any change anywhere, but perhaps it helped.

solardiz commented 1 year ago

@vt-alt Your answer was helpful, thank you! We did not make use of it yet, simply keeping two tests as above in a private repo and that was sufficient for now. We might add the name thing later, for BTI or something else.

BTW, it could be useful for us to be able to enable BTI and/or PAC separately - right now, -cpu max appears to enable both at once. I did not yet look into whether/how QEMU can enable them separately, but we'll need separate changes to support them, so testing one change at a time could be cleaner.

vt-alt commented 1 year ago

-cpu max appears to enable both at once.

How do you see that PAC is enabled? As I understand it, PAC is ptrauth/pauth, and query-cpu-model-expansion shows:

(QEMU) query-cpu-model-expansion type=full model={"name":"max"}
{
    "return": {
        "model": {
            "name": "max",
            "props": {
                "aarch64": true,
                "kvm-no-adjvtime": false,
                "kvm-steal-time": true,
                "pauth": false,                  <-------
                "pmu": true,
                "sve": false,
                "sve1024": false,
                "sve1152": false,
                "sve128": false,
                "sve1280": false,
                "sve1408": false,
                "sve1536": false,
                "sve1664": false,
                "sve1792": false,
                "sve1920": false,
                "sve2048": false,
                "sve256": false,
                "sve384": false,
                "sve512": false,
                "sve640": false,
                "sve768": false,
                "sve896": false
            }
        }
    }
}

So pauth is false on max on QEMU 7.1.0. (And it can be controlled with -cpu pauth= option.)

I didn't find how to check from inside the system if pointer authentication is enabled.
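The check of the query-cpu-model-expansion reply can also be scripted. The following is a minimal sketch, assuming the reply has been captured as JSON text; the sample is trimmed from the output above, and the helper names `cpu_props` and `enabled_features` are hypothetical.

```python
import json

# Trimmed copy of the reply shown above (QEMU 7.1.0, model "max", under KVM).
QMP_REPLY = """
{"return": {"model": {"name": "max",
  "props": {"aarch64": true, "pauth": false, "pmu": true, "sve": false}}}}
"""

def cpu_props(reply_text: str) -> dict:
    """Extract the CPU property map from a query-cpu-model-expansion reply."""
    return json.loads(reply_text)["return"]["model"]["props"]

def enabled_features(props: dict, names=("pauth", "pmu", "sve")) -> list:
    """Return the subset of the given property names that are set to true."""
    return sorted(name for name in names if props.get(name) is True)

if __name__ == "__main__":
    props = cpu_props(QMP_REPLY)
    print("pauth:", props["pauth"])
    print("enabled:", enabled_features(props))
```

This only reports what QEMU says it exposes to the guest; it does not show whether the guest kernel actually uses the feature.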

vt-alt commented 1 year ago

It seems the info should be in ID_AA64ISAR1_EL1 register.

In 2017 https://lore.kernel.org/lkml/1500480092-28480-5-git-send-email-mark.rutland@arm.com/

From ARMv8.3 onwards, ID_AA64ISAR1 is no longer entirely RES0, and now
has four fields describing the presence of pointer authentication
functionality:

* APA - address authentication present, using an architected algorithm
* API - address authentication present, using an IMP DEF algorithm
* GPA - generic authentication present, using an architected algorithm
* GPI - generic authentication present, using an IMP DEF algoithm

This patch adds the requisite definitions so that we can identify the
presence of this functionality. For the timebeing, the features are
hidden from userspace.

I tried to read ID_AA64ISAR1_EL1 from userspace when -cpu pauth=on and off:

ON:  ID_AA64ISAR1_EL1    : 0x0000000000011001
OFF: ID_AA64ISAR1_EL1    : 0x0000000000011001

So there is no difference.

ps. APA should be bits [7:4], so it's 0: Address Authentication using an Architected algorithm is not implemented.

pps. And I understand it may still be present to/in the kernel, but we don't see it.
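To make the reading of 0x0000000000011001 explicit, here is a small Python sketch that splits the low 32 bits of ID_AA64ISAR1_EL1 into 4-bit fields. The field offsets follow the Arm architecture's layout of this register; the function names are hypothetical, and higher fields are left out for brevity.

```python
# 4-bit fields in the low 32 bits of ID_AA64ISAR1_EL1, keyed by bit offset.
FIELDS = {
    "DPB": 0,
    "APA": 4,
    "API": 8,
    "JSCVT": 12,
    "FCMA": 16,
    "LRCPC": 20,
    "GPA": 24,
    "GPI": 28,
}

def decode_isar1(value: int) -> dict:
    """Return each known field of ID_AA64ISAR1_EL1 as an integer 0..15."""
    return {name: (value >> shift) & 0xF for name, shift in FIELDS.items()}

def pauth_reported(value: int) -> bool:
    """True if any of the four pointer authentication fields is non-zero."""
    fields = decode_isar1(value)
    return any(fields[name] for name in ("APA", "API", "GPA", "GPI"))

if __name__ == "__main__":
    value = 0x0000000000011001  # read from userspace in both the ON and OFF runs
    for name, field in decode_isar1(value).items():
        print(f"{name:6} = {field}")
    print("pointer authentication reported:", pauth_reported(value))
```

For the value above, only DPB, JSCVT and FCMA are non-zero, which matches the observation that APA reads as 0.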

vt-alt commented 1 year ago

ppps. Documentation/arm64/cpu-feature-registers.rst says

  5) ID_AA64ISAR1_EL1 - Instruction set attribute register 1

     +------------------------------+---------+---------+
     | Name                         |  bits   | visible |
     +------------------------------+---------+---------+
...
     +------------------------------+---------+---------+
     | APA                          | [7-4]   |    y    |
     +------------------------------+---------+---------+

So it should be visible, but the field reads as 0.

ps. arch/arm64/kernel/cpufeature.c:

static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
...
        ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
                       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_GPI_SHIFT, 4, 0),
        ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
                       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_GPA_SHIFT, 4, 0),
...
        ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
                       FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_EL1_API_SHIFT, 4, 0),
        ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
                       FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_EL1_APA_SHIFT, 4, 0),

vt-alt commented 1 year ago

On test VM:

root@aarch64:/.in# zcat /proc/config.gz |grep CONFIG_ARM64_PTR_AUTH
CONFIG_ARM64_PTR_AUTH=y
CONFIG_ARM64_PTR_AUTH_KERNEL=y

So FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH) should be true. Also, it seems there should be a "CPU features: detected: Address authentication" line in dmesg, but it's not there. Booted with -cpu max,pauth=on:

root@aarch64:/.in# dmesg|grep -i CPU.feature
[    0.000000] CPU features: detected: GIC system register CPU interface
[    0.000000] CPU features: detected: Hardware dirty bit management
[    0.000000] CPU features: detected: Spectre-v4
[    0.000000] CPU features: kernel page table isolation forced OFF by mitigations=off
[    0.176424] CPU features: detected: Common not Private translations
[    0.176425] CPU features: detected: CRC32 instructions
[    0.176426] CPU features: detected: Data cache clean to Point of Persistence
[    0.176428] CPU features: detected: LSE atomic instructions
[    0.176429] CPU features: detected: Privileged Access Never
[    0.176429] CPU features: detected: RAS Extension Support

vt-alt commented 1 year ago

Ah, my QEMU command line is so long, and there was an additional -cpu max option in it. When I deleted it, QEMU failed with:

qemu-system-aarch64: 'pauth' feature not supported by KVM on this host

Then when I disable KVM, auth is there:

root@aarch64:/.in# dmesg|grep -i CPU.feature|grep -i auth
[    0.000000] CPU features: detected: Address authentication (architected QARMA5 algorithm)
[    0.538645] CPU features: detected: Generic authentication (architected QARMA5 algorithm)

When I run with -cpu max,pauth=off these lines disappear. When I run -cpu max they appear. (Note that in the posts above I was running -cpu max under KVM, now under TCG.)

So now we know a method of selectively disabling/enabling pauth.
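The dmesg check used above could be automated in a test along these lines. This is a sketch only, assuming the kernel keeps printing "CPU features: detected: ... authentication" lines in the form shown in this thread; the function name `detected_auth` is hypothetical.

```python
import re

# Matches the two pointer authentication lines seen in the TCG boot log above.
AUTH_LINE = re.compile(r"CPU features: detected: (Address|Generic) authentication")

def detected_auth(dmesg_text: str) -> list:
    """Return which kinds of pointer authentication the kernel reported."""
    return sorted({match.group(1) for match in AUTH_LINE.finditer(dmesg_text)})

# Lines copied from the logs in this thread.
TCG_LOG = """\
[    0.000000] CPU features: detected: Address authentication (architected QARMA5 algorithm)
[    0.538645] CPU features: detected: Generic authentication (architected QARMA5 algorithm)
"""
KVM_LOG = """\
[    0.176425] CPU features: detected: CRC32 instructions
[    0.176429] CPU features: detected: Privileged Access Never
"""

if __name__ == "__main__":
    print("TCG, -cpu max:", detected_auth(TCG_LOG))
    print("KVM, -cpu max:", detected_auth(KVM_LOG))
```

In a real run the text would come from the guest's dmesg output rather than from embedded samples.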

vt-alt commented 1 year ago

Also we can disable bti/pauth support from the kernel command line (since v5.12) by passing arm64.nobti and arm64.nopauth. This may be sufficient for our purposes!
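With the two kernel parameters, the four BTI/PAC combinations could be enumerated as separate test variants. Below is a minimal sketch, assuming every variant boots with -cpu max under TCG and differs only in the extra kernel command line; the dictionary keys (`name`, `cpu`, `append`) are hypothetical and not part of the existing workflow.

```python
from itertools import product

def variants() -> list:
    """Build one test variant per BTI/PAC on/off combination."""
    result = []
    for bti, pac in product((True, False), repeat=2):
        extra = []
        if not bti:
            extra.append("arm64.nobti")    # kernel parameter named above
        if not pac:
            extra.append("arm64.nopauth")  # kernel parameter named above
        result.append({
            "name": f"BTI {'on' if bti else 'off'}, PAC {'on' if pac else 'off'}",
            "cpu": "max",                  # assumption: same CPU model for all
            "append": " ".join(extra),     # extra kernel command line
        })
    return result

if __name__ == "__main__":
    for variant in variants():
        print(f"{variant['name']:20} -cpu {variant['cpu']}  append='{variant['append']}'")
```

Keeping the CPU model fixed and varying only the kernel command line means each failure can be attributed to one feature at a time.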

solardiz commented 1 year ago

-cpu max appears to enable both at once.

How do you see that PAC is enabled?

Indirectly: when @redplait worked around BTI in a fork/branch of LKRG, the next hurdle was failing pCFI, meaning that addresses on the stack were not what our code was expecting, which we assumed was a result of PAC (it's as expected).

Then when I disable KVM, auth is there:

root@aarch64:/.in# dmesg|grep -i CPU.feature|grep -i auth
[    0.000000] CPU features: detected: Address authentication (architected QARMA5 algorithm)
[    0.538645] CPU features: detected: Generic authentication (architected QARMA5 algorithm)

When I run with -cpu max,pauth=off these lines disappear. When I run -cpu max they appear.

Cool. BTW, how could you even have KVM before if your host isn't AArch64, or is it?

Also we can disable bti/pauth support from the kernel command line (since v5.12) by passing arm64.nobti and arm64.nopauth. This may be sufficient for our purposes!

Great. So we have two ways to control these features separately. Thank you!

vt-alt commented 1 year ago

BTW, how could you even have KVM before if your host isn't AArch64, or is it?

I experimented with this locally on a Kunpeng 920 (鲲鹏920), not realizing it doesn't have these features.