HyperEnclave / hyperenclave

An Open and Cross-platform Trusted Execution Environment.
Apache License 2.0

Can't load HyperEnclaveDriver: hang on & need manual reboot machine #12

Closed linguohua closed 7 months ago

linguohua commented 8 months ago

Hi, I am trying HyperEnclave on Ubuntu 20.04. When I run 'bash start_hyperenclave.sh', it always hangs and freezes the whole system, and I have to reboot the machine manually.

When I comment out the following lines in the file https://github.com/HyperEnclave/hyperenclave-driver/blob/0af18c3ea3a3935fd9bd5d08b901ff6a288b161d/driver/main.c#L456C2-L456C2

    preempt_disable();

    header->online_cpus = num_online_cpus();
    atomic_set(&call_done, 0);
    on_each_cpu(enter_hypervisor, header, 0);
    while (atomic_read(&call_done) != num_online_cpus())
        cpu_relax();

    preempt_enable();

and then compile HyperEnclaveDriver and run it again, it no longer hangs, but dmesg shows an error log: Initialize CMRM fail

Environment:

cat /proc/cmdline

BOOT_IMAGE=/boot/vmlinuz-5.4.0-166-generic root=UUID=7eb35854-7835-49d7-8c68-b7eb7e8f296c ro memmap=32G$0x100000000 amd_iommu=off intremap=off no5lvl default_hugepagesz=1G hugepagesz=1G

uname -a

Linux t017372-172-25-10-21 5.4.0-166-generic #183-Ubuntu SMP Mon Oct 2 11:28:33 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a

    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 20.04.3 LTS
    Release:    20.04
    Codename:   focal

lscpu

    Architecture:                       x86_64
    CPU op-mode(s):                     32-bit, 64-bit
    Byte Order:                         Little Endian
    Address sizes:                      43 bits physical, 48 bits virtual
    CPU(s):                             128
    On-line CPU(s) list:                0-127
    Thread(s) per core:                 1
    Core(s) per socket:                 64
    Socket(s):                          2
    NUMA node(s):                       8
    Vendor ID:                          AuthenticAMD
    CPU family:                         25
    Model:                              1
    Model name:                         AMD EPYC 7T83 64-Core Processor
    Stepping:                           1
    Frequency boost:                    enabled
    CPU MHz:                            1794.019
    CPU max MHz:                        3500.0000
    CPU min MHz:                        1500.0000
    BogoMIPS:                           6986.98
    Virtualization:                     AMD-V
    L1d cache:                          4 MiB
    L1i cache:                          4 MiB
    L2 cache:                           64 MiB
    L3 cache:                           512 MiB
    NUMA node0 CPU(s):                  0-15
    NUMA node1 CPU(s):                  16-31
    NUMA node2 CPU(s):                  32-47
    NUMA node3 CPU(s):                  48-63
    NUMA node4 CPU(s):                  64-79
    NUMA node5 CPU(s):                  80-95
    NUMA node6 CPU(s):                  96-111
    NUMA node7 CPU(s):                  112-127
    Vulnerability Gather data sampling: Not affected
    Vulnerability Itlb multihit:        Not affected
    Vulnerability L1tf:                 Not affected
    Vulnerability Mds:                  Not affected
    Vulnerability Meltdown:             Not affected
    Vulnerability Mmio stale data:      Not affected
    Vulnerability Retbleed:             Not affected
    Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl and seccomp
    Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
    Vulnerability Spectre v2:           Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
    Vulnerability Srbds:                Not affected
    Vulnerability Tsx async abort:      Not affected
    Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca sme sev sev_es
Bonjourz commented 8 months ago

Hi @linguohua, thanks for your interest in HyperEnclave.

The code you have commented out:

    preempt_disable();

    header->online_cpus = num_online_cpus();
    atomic_set(&call_done, 0);
    on_each_cpu(enter_hypervisor, header, 0);
    while (atomic_read(&call_done) != num_online_cpus())
        cpu_relax();

    preempt_enable();

is the key operation that starts HyperEnclave on the platform, so it is expected that you get

Initialize CMRM fail

in the log shown by dmesg.
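For readers unfamiliar with the pattern in the quoted driver code, here is a minimal user-space sketch in Rust, with threads standing in for CPUs. The names `enter` and `start_on_all` are invented for illustration and are not part of the driver; the sketch only mirrors the idea of "run a callback everywhere, then spin until a shared counter reaches the number of participants".

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the per-CPU callback: do the work,
/// then report completion by bumping the shared counter.
fn enter(cpu: usize, call_done: &AtomicUsize) {
    let _ = cpu; // the real driver would jump into the hypervisor entry here
    call_done.fetch_add(1, Ordering::SeqCst);
}

/// Run `enter` on `n_cpus` worker threads and spin until all have reported.
/// Returns the final value of the counter.
fn start_on_all(n_cpus: usize) -> usize {
    let call_done = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n_cpus)
        .map(|cpu| {
            let counter = Arc::clone(&call_done);
            thread::spawn(move || enter(cpu, &counter))
        })
        .collect();

    // Same shape as the driver's `while (atomic_read(...) != ...) cpu_relax();`
    while call_done.load(Ordering::SeqCst) != n_cpus {
        std::hint::spin_loop();
    }

    for handle in handles {
        handle.join().unwrap();
    }
    call_done.load(Ordering::SeqCst)
}

fn main() {
    println!("{} workers checked in", start_on_all(4));
}
```

If any single callback never returns, the counter never reaches the target and the waiting side spins forever, which would be consistent with a whole-machine freeze when the wait runs with preemption disabled.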

In your case, something probably fails while HyperEnclave is starting, but there is no dmesg log during early HyperEnclave initialization (we are working on supporting this). For now, we retrieve this debug information from the serial port output. Your platform appears to be a server, so could you please show the serial port output when you start HyperEnclave? We usually get the serial port output with the following command:

ipmitool -I lanplus -H [IP address] -U admin -P admin sol activate
linguohua commented 8 months ago

Thanks for your reply.

My server is located in an IDC far from my work site, so it is not easy to get the log from the serial port, but I will try my best to get it later.

More information about the hang: our enclave driver hangs on the following line: https://github.com/HyperEnclave/hyperenclave-driver/blob/0af18c3ea3a3935fd9bd5d08b901ff6a288b161d/driver/main.c#L158C20-L158C20

        err = entry(cpu);

When this line is commented out, the enclave driver can proceed (ending with 'Initialize CMRM fail'). It seems that the entry of rust-monitor has some issue that blocks progress.

Bonjourz commented 8 months ago

Could you please show the complete dmesg log with the following line commented out:

                err = entry(cpu);

Maybe we can get more information about your platform from the log.

linguohua commented 8 months ago

Hi, thanks for your reply, and sorry for the delay.

I commented out the following lines in the function enter_hypervisor: https://github.com/HyperEnclave/hyperenclave-driver/blob/0af18c3ea3a3935fd9bd5d08b901ff6a288b161d/driver/main.c#L154C1-L154C1:

void enter_hypervisor(void *info)
{
        //struct hyper_header *header = info;
        //unsigned int cpu = smp_processor_id();
        //int (*entry)(unsigned int);
        int err;

        //entry = header->entry + (unsigned long)hypervisor_mem;

        //if (cpu < header->max_cpus)
                /* either returns 0 or the same error code across all CPUs */
        //      err = entry(cpu);
        //else
        //      err = -EINVAL;
        err = 0;
        if (err)
                error_code = err;

#if defined(CONFIG_X86) && LINUX_VERSION_CODE >= KERNEL_VERSION(4, 0, 0)
        /* on Intel, VMXE is now on - update the shadow */
        cr4_init_shadow();
#endif

        atomic_inc(&call_done);
}

and set err to zero.

The dmesg logs:

[200625.628507] HE: cpu_vendor_detect: 39. Vendor ID: AuthenticAMD
[200625.643864] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000000000000 -> 0x00000000000a0000], type: System RAM
[200625.643866] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000000a0000 -> 0x0000000000100000], type: Reserved
[200625.643867] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000000100000 -> 0x0000000030000000], type: System RAM
[200625.643868] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000030000000 -> 0x0000000030047000], type: ACPI Non-volatile Storage
[200625.643869] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000030047000 -> 0x0000000075cf0000], type: System RAM
[200625.643870] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000075cf0000 -> 0x0000000076000000], type: Reserved
[200625.643871] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000076000000 -> 0x00000000a60b6000], type: System RAM
[200625.643872] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a60b6000 -> 0x00000000a820f000], type: Reserved
[200625.643873] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a820f000 -> 0x00000000a83c9000], type: ACPI Tables
[200625.643874] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a83c9000 -> 0x00000000a88d4000], type: ACPI Non-volatile Storage
[200625.643874] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a88d4000 -> 0x00000000a97ff000], type: Reserved
[200625.643875] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a97ff000 -> 0x00000000ac000000], type: System RAM
[200625.643876] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000ac000000 -> 0x00000000b0000000], type: Reserved
[200625.643877] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000b4280000 -> 0x00000000b4281000], type: Reserved
[200625.643878] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000b5180000 -> 0x00000000b5181000], type: Reserved
[200625.643878] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000be180000 -> 0x00000000be181000], type: Reserved
[200625.643879] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000bf200000 -> 0x00000000bf301000], type: Reserved
[200625.643880] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c8180000 -> 0x00000000c8181000], type: Reserved
[200625.643881] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9100000 -> 0x00000000c9200000], type: Reserved
[200625.643881] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9300000 -> 0x00000000c9401000], type: Reserved
[200625.643882] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9500000 -> 0x00000000c9600000], type: Reserved
[200625.643883] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000f4180000 -> 0x00000000f4181000], type: Reserved
[200625.643884] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000f5180000 -> 0x00000000f5181000], type: Reserved
[200625.643884] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fea00000 -> 0x00000000feb00000], type: Reserved
[200625.643885] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fec00000 -> 0x00000000fec01000], type: Reserved
[200625.643886] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fec10000 -> 0x00000000fec11000], type: Reserved
[200625.643887] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed00000 -> 0x00000000fed01000], type: Reserved
[200625.643887] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed40000 -> 0x00000000fed45000], type: Reserved
[200625.643888] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed80000 -> 0x00000000fed90000], type: Reserved
[200625.643889] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fedc0000 -> 0x00000000fedc1000], type: Reserved
[200625.643890] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fedc2000 -> 0x00000000fedc9000], type: Reserved
[200625.643890] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fee00000 -> 0x00000000fef00000], type: Reserved
[200625.643891] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000ff000000 -> 0x0000000100000000], type: Reserved
[200625.643892] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000100000000 -> 0x000000404fe00000], type: System RAM
[200625.643893] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000404fe00000 -> 0x0000004050000000], type: Reserved
[200625.643894] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000004050000000 -> 0x000000804ff00000], type: System RAM
[200625.643894] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000804ff00000 -> 0x0000008050000000], type: Reserved
[200625.643895] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000008050000000 -> 0x000000c04ff00000], type: System RAM
[200625.643896] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000c04ff00000 -> 0x000000c050000000], type: Reserved
[200625.643897] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000fd00000000 -> 0x0000010000000000], type: Reserved
[200625.643898] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000010000000000 -> 0x0000013fff300000], type: System RAM
[200625.643899] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000013fff300000 -> 0x0000014000000000], type: Reserved
[200625.643899] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000014000000000 -> 0x0000017ffff00000], type: System RAM
[200625.643900] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000017ffff00000 -> 0x0000018000000000], type: Reserved
[200625.643901] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000018000000000 -> 0x000001bffff00000], type: System RAM
[200625.643902] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001bffff00000 -> 0x000001c000000000], type: Reserved
[200625.643902] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001c000000000 -> 0x000001fffff00000], type: System RAM
[200625.643903] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001fffff00000 -> 0x0000020000000000], type: Reserved
[200625.643904] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000020000000000 -> 0x0000023ffff00000], type: System RAM
[200625.643905] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000023ffff00000 -> 0x0000024010400000], type: Reserved
[200625.643905] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000034030000000 -> 0x0000034040400000], type: Reserved
[200625.643906] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000034060000000 -> 0x0000034070400000], type: Reserved
[200625.643907] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000004c090000000 -> 0x000004c0a0400000], type: Reserved
[200625.643908] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000004c0c0000000 -> 0x000004c0d0400000], type: Reserved
[200625.643908] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000005c0f0000000 -> 0x000005c100400000], type: Reserved
[200625.643909] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000005c120000000 -> 0x000005c130400000], type: Reserved
[200625.643910] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000006c150000000 -> 0x000006c160400000], type: Reserved
[200625.643911] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000007fc00000000 -> 0x000007fc04000000], type: Reserved
[200625.643914] HE: get_convertible_memory: 213. Convertible Memory[ 0]: 0x0000000000000000 -> 0x00000000000a0000
[200625.643915] HE: get_convertible_memory: 213. Convertible Memory[ 1]: 0x0000000000100000 -> 0x0000000030000000
[200625.643916] HE: get_convertible_memory: 213. Convertible Memory[ 2]: 0x0000000030047000 -> 0x0000000075cf0000
[200625.643917] HE: get_convertible_memory: 213. Convertible Memory[ 3]: 0x0000000076000000 -> 0x00000000a60b6000
[200625.643918] HE: get_convertible_memory: 213. Convertible Memory[ 4]: 0x00000000a97ff000 -> 0x00000000ac000000
[200625.643919] HE: get_convertible_memory: 213. Convertible Memory[ 5]: 0x0000000100000000 -> 0x000000404fe00000
[200625.643919] HE: get_convertible_memory: 213. Convertible Memory[ 6]: 0x0000004050000000 -> 0x000000804ff00000
[200625.643920] HE: get_convertible_memory: 213. Convertible Memory[ 7]: 0x0000008050000000 -> 0x000000c04ff00000
[200625.643921] HE: get_convertible_memory: 213. Convertible Memory[ 8]: 0x0000010000000000 -> 0x0000013fff300000
[200625.643922] HE: get_convertible_memory: 213. Convertible Memory[ 9]: 0x0000014000000000 -> 0x0000017ffff00000
[200625.643923] HE: get_convertible_memory: 213. Convertible Memory[10]: 0x0000018000000000 -> 0x000001bffff00000
[200625.643924] HE: get_convertible_memory: 213. Convertible Memory[11]: 0x000001c000000000 -> 0x000001fffff00000
[200625.643924] HE: get_convertible_memory: 213. Convertible Memory[12]: 0x0000020000000000 -> 0x0000023ffff00000
[200625.643925] HE: get_convertible_memory: 218. Convertible Memory size: 0x1fff7000000
[200625.643927] HE: get_valid_rsrv_mem: 285. Reserved Memory[ 0]: 0x100000000 -> 0x900000000
[200625.643927] HE: get_valid_rsrv_mem: 290. Reserved Memory size: 0x800000000
[200625.643929] HE: get_sme_mask: 68. SME mask: [0x8000000000000]
[200625.645891] HE: mem_test: 48. Memory[0x100000000 - 0x300000000] test begin
[200627.365922] HE: mem_test: 78. Memory[0x100000000 - 0x300000000] test pass
[200627.366262] HE: mem_test: 48. Memory[0x300000000 - 0x500000000] test begin
[200629.872067] HE: mem_test: 78. Memory[0x300000000 - 0x500000000] test pass
[200629.872721] HE: mem_test: 48. Memory[0x500000000 - 0x700000000] test begin
[200632.594300] HE: mem_test: 78. Memory[0x500000000 - 0x700000000] test pass
[200632.595357] HE: mem_test: 48. Memory[0x700000000 - 0x900000000] test begin
[200635.339625] HE: mem_test: 78. Memory[0x700000000 - 0x900000000] test pass
[200635.351065] HE: get_hypervisor_meminfo: 185. HE_WARN. get_hypervisor_meminfo 1 : 0x100000001, 0x800000000
[200635.351067] HE: get_hypervisor_meminfo: 198. HE_WARN. get_hypervisor_meminfo 1: 1
[200635.351069] HE: get_hv_heap_size: 375. Hypervisor heap size: 0x23f800000
[200635.351069] HE: get_hv_cmrm_size: 387. Hypervisor cmrm size: 0x35ffff000
[200635.351070] HE: get_hv_frame_size: 400. Hypervisor frame size: 0xffc00000
[200635.351071] HE: get_hypervisor_size: 413. Hv_core_and_percpu_size: 0x8914000, Hypervisor size: 0x6c0000000
[200635.351072] HE: get_hypervisor_meminfo: 206. HE_WARN. get_hypervisor_meminfo 2r : 0x100000001, 0x800000000
[200635.351073] HE: get_hypervisor_meminfo: 207. HE_WARN. get_hypervisor_meminfo 2h : 0x0, 0x6c0000000
[200635.351073] HE: he_cmd_enable: 302. hypervisor size: 0x6c0000000
[200635.351075] HE: get_sme_mask: 68. SME mask: [0x8000000000000]
[200636.643143] HE: he_cmd_enable: 352. config_size: 2916
[200636.687601] HE: add_epc_pages: 43. total_epc_pages: 0x140000, free_epc_pages: 0x140000
[200636.687603] HE: init_enclave_page: 317. epc ranges: [0x7c0000000-0x900000000], 0x140000000
[200636.687604] HE: init_enclave_page: 333. Initialized EPC ranges size: 0x140000000
[200636.687606] HE: he_cmd_enable: 383. config_header load_addr: 0xffffff0008914000
[200636.687683] HE: he_cmd_enable: 404. mem_region load_addr: 0xffffff0008914124
[200636.687685] HE: inspect_tpm: 206. using fake tpm
[200636.687686] HE: he_cmd_enable: 411. tpm mmio type=8,size=0 pa=ffffffff
[200636.719416] HE: init_cmrm: 444. HE_ERROR. Initialize [0x0 -> 0x1000000000]'s CMRM error, ret: 6
[200636.719444] HE: he_cmd_enable: 474. HE_ERROR. Initialize CMRM fail, err: 6
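As a cross-check of the log above, the 13 "Convertible Memory" ranges can be summed and compared with the reported "Convertible Memory size". This is only an arithmetic sketch in Rust; the constant `RANGES` is copied by hand from the dmesg lines, and `total_size` is a name made up for this example.

```rust
/// (start, end) pairs copied from the "Convertible Memory[..]" dmesg lines.
const RANGES: [(u64, u64); 13] = [
    (0x0000000000000000, 0x00000000000a0000),
    (0x0000000000100000, 0x0000000030000000),
    (0x0000000030047000, 0x0000000075cf0000),
    (0x0000000076000000, 0x00000000a60b6000),
    (0x00000000a97ff000, 0x00000000ac000000),
    (0x0000000100000000, 0x000000404fe00000),
    (0x0000004050000000, 0x000000804ff00000),
    (0x0000008050000000, 0x000000c04ff00000),
    (0x0000010000000000, 0x0000013fff300000),
    (0x0000014000000000, 0x0000017ffff00000),
    (0x0000018000000000, 0x000001bffff00000),
    (0x000001c000000000, 0x000001fffff00000),
    (0x0000020000000000, 0x0000023ffff00000),
];

/// Sum of the lengths of all half-open ranges [start, end).
fn total_size(ranges: &[(u64, u64)]) -> u64 {
    ranges.iter().map(|&(start, end)| end - start).sum()
}

fn main() {
    let total = total_size(&RANGES);
    let tib = total as f64 / (1u64 << 40) as f64;
    println!("convertible memory: {:#x} bytes (about {:.2} TiB)", total, tib);
}
```

The sum equals the 0x1fff7000000 printed by the driver, which is just under 2 TiB (0x20000000000).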
Bonjourz commented 8 months ago

Hi @linguohua,

There seems to be no problem in the dmesg log, so it is better to look at the serial port output.

> My server is located in an IDC, far away from my work site, so it is not easy to get the log out from serial port, but I will try my best to get it later.

Actually, the serial port output can be viewed remotely. Could you please ask the IDC service provider for more information?

Another thing to check: is the platform on which you run HyperEnclave a virtual machine or a bare-metal machine? Currently, we do not support starting HyperEnclave in a virtual machine.
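One quick way to check the virtual machine question on Linux is to look for the `hypervisor` CPU flag in /proc/cpuinfo, which the kernel reports when CPUID indicates that it runs as a guest. The sketch below is in Rust; the function name `runs_under_hypervisor` is made up for this example, and a missing flag is a strong hint rather than proof of bare metal.

```rust
use std::fs;

/// Returns true if any "flags" line of a /proc/cpuinfo-style text
/// contains the `hypervisor` flag.
fn runs_under_hypervisor(cpuinfo: &str) -> bool {
    cpuinfo
        .lines()
        .filter(|line| line.starts_with("flags"))
        .any(|line| line.split_whitespace().any(|flag| flag == "hypervisor"))
}

fn main() {
    match fs::read_to_string("/proc/cpuinfo") {
        Ok(text) => println!("hypervisor flag present: {}", runs_under_hypervisor(&text)),
        Err(err) => eprintln!("cannot read /proc/cpuinfo: {}", err),
    }
}
```

The lscpu flags posted at the top of this issue do not contain `hypervisor`, which fits a bare-metal machine.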

To learn more about your platform, could you please provide the output of cpuid? You can get it by typing cpuid in your shell:

$ sudo apt install cpuid
$ cpuid
linguohua commented 8 months ago

Hi @Bonjourz, sorry for the delay.

The CPU is an AMD 7T83, and the platform is a bare-metal machine.

> Actually, the output of Serial Port can be shown remotely, could you please ask the service provider of IDC for more information?

Yes, I can. I will ask the IDC for help tomorrow and post the log here, along with the cpuid information. Again, thanks for your help.

linguohua commented 8 months ago

Hi @Bonjourz, I have got the log from the serial port via IPMI.

With the original HyperEnclave code, the log is as follows:

heap_allocator-bug2

It reports an index out of bounds, so I changed the 'buddy_system_allocator' version to '0.8', which lets me raise the 'LockedHeap' bound from 32 to 64 in src/memory/heap.rs:

#[cfg_attr(not(test), global_allocator)]
static HEAP_ALLOCATOR: LockedHeap<64> = LockedHeap::<64>::new();
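One plausible reading of the out-of-bounds panic, offered as an assumption rather than a confirmed diagnosis: a buddy allocator keeps one free list per power-of-two size class, and the const parameter bounds the number of classes. The Rust sketch below is a simplified model of how a heap range is split into power-of-two blocks. The start address 0x4_0000_0000 is hypothetical, the size 0x23f800000 is the "Hypervisor heap size" from the dmesg log earlier in this issue, and `max_order_used` is a name made up for the example.

```rust
/// Largest power of two that is <= n (n must be non-zero).
fn prev_power_of_two(n: u64) -> u64 {
    1u64 << (63 - n.leading_zeros())
}

/// Simplified model of adding [start, start + size) to a buddy heap:
/// repeatedly carve off the largest naturally aligned power-of-two block
/// and record the highest size class (log2 of the block size) that is used.
fn max_order_used(start: u64, size: u64) -> u32 {
    let end = start + size;
    let mut cur = start;
    let mut max_order = 0;
    while cur < end {
        // Lowest set bit of the address limits the alignment of the block.
        let lowbit = if cur == 0 { u64::MAX } else { cur & cur.wrapping_neg() };
        let block = lowbit.min(prev_power_of_two(end - cur));
        max_order = max_order.max(block.trailing_zeros());
        cur += block;
    }
    max_order
}

fn main() {
    // Hypothetical, well-aligned start address; heap size taken from dmesg.
    let order = max_order_used(0x4_0000_0000, 0x2_3f80_0000);
    println!("highest size class used: {}", order);
    println!("fits in 32 classes: {}", order < 32);
    println!("fits in 64 classes: {}", order < 64);
}
```

Under these assumptions a heap of roughly 9 GiB needs a size class above 31, so an array of 32 free lists would be indexed out of bounds while 64 would be enough.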

Then it hangs again, but at a different position. The logs:

bitmap_allocator-bug2

OK, now it says that 'bitmap_allocator' has a problem, so I went on to modify the following code in src/memory/frame.rs:

// Support max 16M * 4096 = 64GB memory.
type FrameAlloc = bitmap_allocator::BitAlloc16M;
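The numbers in that comment follow from simple arithmetic: a bitmap allocator tracks one 4096-byte frame per bit, so its capacity is the bit count times the page size. A small Rust sketch, where `capacity_bytes` and `frames_needed` are names made up for the example:

```rust
const PAGE_SIZE: u64 = 4096;

/// Bytes of memory that a bitmap with `frames` bits can track.
fn capacity_bytes(frames: u64) -> u64 {
    frames * PAGE_SIZE
}

/// Number of 4 KiB frames needed to cover `bytes` (rounded up).
fn frames_needed(bytes: u64) -> u64 {
    (bytes + PAGE_SIZE - 1) / PAGE_SIZE
}

fn main() {
    let gib = 1u64 << 30;
    println!("1M bits  -> {} GiB", capacity_bytes(1 << 20) / gib);
    println!("16M bits -> {} GiB", capacity_bytes(1 << 24) / gib);
    println!("frames for 32 GiB: {:#x}", frames_needed(32 * gib));
}
```

So `BitAlloc1M` covers at most 4 GiB of frames and `BitAlloc16M` covers 64 GiB.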

But, frustratingly, it still hangs, at a new position. The logs: page-error

Maybe the main cause is my CPU: the AMD 7T83 has 8 NUMA nodes and 128 cores (16 cores per node), and maybe HyperEnclave does not yet support that many cores.

Another example: when I use the kernel command line recommended by HyperEnclave's README, memmap=4G\\\$0x100000000, HyperEnclave also fails. When I change it to memmap=32G\\\$0x100000000, it gets further but finally hangs the whole machine, as mentioned above.
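To make the two settings concrete: `memmap=SIZE$START` asks the kernel to mark SIZE bytes starting at START as reserved. The Rust sketch below decodes such an argument into a [start, end) range; `parse_size` and `parse_memmap` are helper names made up for this example, and only the common K/M/G/T suffixes and 0x-prefixed hex are handled.

```rust
/// Parse "32G", "4096" or "0x100000000" into a byte count.
fn parse_size(text: &str) -> Option<u64> {
    let text = text.trim();
    if let Some(hex) = text.strip_prefix("0x") {
        return u64::from_str_radix(hex, 16).ok();
    }
    let (digits, shift) = match text.chars().last()? {
        'K' | 'k' => (&text[..text.len() - 1], 10),
        'M' | 'm' => (&text[..text.len() - 1], 20),
        'G' | 'g' => (&text[..text.len() - 1], 30),
        'T' | 't' => (&text[..text.len() - 1], 40),
        _ => (text, 0),
    };
    digits.parse::<u64>().ok().map(|n| n << shift)
}

/// Parse "memmap=SIZE$START" into the reserved range (start, end).
fn parse_memmap(arg: &str) -> Option<(u64, u64)> {
    let body = arg.strip_prefix("memmap=")?;
    let (size, start) = body.split_once('$')?;
    let size = parse_size(size)?;
    let start = parse_size(start)?;
    Some((start, start + size))
}

fn main() {
    for arg in ["memmap=4G$0x100000000", "memmap=32G$0x100000000"] {
        match parse_memmap(arg) {
            Some((start, end)) => println!("{} -> {:#x} .. {:#x}", arg, start, end),
            None => println!("{} -> could not parse", arg),
        }
    }
}
```

The 32G setting decodes to 0x100000000 .. 0x900000000, which matches the "Reserved Memory[ 0]" line in the dmesg log posted earlier.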

Bonjourz commented 8 months ago

Hi @linguohua,

Thanks for your input. The third picture is a bit blurry, and the information shown is incomplete. Could you please post a clearer version of the third picture that contains more of the log?

linguohua commented 7 months ago

Hi @Bonjourz. Sorry again for the delay.

Here is the third picture, with improved quality: page_already_mapped_bug

The content of the lines in red is:

[ERROR][96]
panicked at 'called `Result::unwrap()` on an `Err` value: [src/memory/paging.rs:46:18] Bad address: AlreadyMapped((0, 14076680675840, READ | WRITE | EXECUTE | USER, Size1G))', src/cell.rs:2
Current Cpu: PerCpu {
Bonjourz commented 7 months ago

Hi @linguohua.

Thanks for providing more information for us.

  1. The log you provided
    [ERROR][96]
    panicked at 'called `Result::unwrap()` on an `Err` value: [src/memory/paging.rs:46:18] Bad address: AlreadyMapped((0, 14076680675840, READ | WRITE | EXECUTE | USER, Size1G))', src/cell.rs:2
    Current Cpu: PerCpu {

is still incomplete. The log should show the line of code that triggers the error (src/cell.rs:2....), but it seems to be truncated.

Could you please add some logging to find out which expression in src/cell.rs triggers the error?

  2. According to the log you provided, the memory size of your platform is 2TB. Currently, the open-source HyperEnclave cannot support that. Could you please reduce the memory size of your platform to 512 GB?

  3. Alternatively, supporting a 2TB platform requires some modifications to the code. Since I cannot access your environment, it is hard for me to develop them, so I can only offer some advice. Maybe you can remove the code at https://github.com/HyperEnclave/hyperenclave/blob/master/src/cell.rs#L159C21-L165:

                hvm.insert(MemoryRegion::new_with_offset_mapper(
                    hv_virt_start,
                    region.phys_start as HostPhysAddr,
                    region.size as usize,
                    MemFlags::READ | MemFlags::WRITE,
                ))?;
                // Support hardware encrypt when swap out EPC page to guest RAM
-               #[cfg(feature = "sme")]
-               hvm.insert(MemoryRegion::new_with_offset_mapper(
-                  region.virt_start as HostVirtAddr,
-                  region.phys_start as HostPhysAddr,
-                  region.size as usize,
-                  MemFlags::READ | MemFlags::WRITE | MemFlags::ENCRYPTED,
-               ))?;
                normal_world_mem_region.insert(
                    (region.phys_start as usize)..(region.phys_start + region.size) as usize,
                )?;

and then have a try.

  4. HyperEnclave is low-level software running on a bare-metal machine, so there are still corner cases on new platforms that we have not considered and that HyperEnclave cannot handle. Support for TB-level memory is a new feature in our plan. For now, HyperEnclave should inform developers that it cannot support TB-level memory. Thanks for the information that helped us discover this. We will set up an environment similar to yours and test it.

If you have any updates, feel free to contact us. Looking forward to your reply.

linguohua commented 7 months ago

Hi @Bonjourz. Thanks for your reply and suggestion.

> The log can show the line of codes which triggers the error: src/cell.rs:2...., but it seems the log is truncated.

Do you mean a backtrace here? But 'PanicInfo' has no backtrace. I tried to use std::backtrace and the std::panic::set_hook function, but compilation failed because HyperEnclave is built as 'no-std'.

> Could you please change the memory size of your platform to 512 GB?

Yes, I used the boot cmdline option mem=512G to limit Linux to 512GB of memory. But HyperEnclave still hangs the whole machine; the difference is that there are no error logs:

512gb

Bonjourz commented 7 months ago

Hi, @linguohua

Thanks again for your information.

> Hi, @Bonjourz . Thanks for your reply and suggestion.
>
> The log can show the line of codes which triggers the error: src/cell.rs:2...., but it seems the log is truncated.
>
> Here did you mean backtrace? But 'PanicInfo' have no backtrace. I try to use std::backtrace crate and std::panic::set_hook function, but compile failed for HyperEnclave 'no-std' cfg.

I mean that the error log should show which line of code in src/cell.rs triggers the error, but the log you provided seems to be truncated. The log you provided previously shows only that the line number is 2x or 2xx (I can only see the leading 2...).
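One way to get the failing call site without a backtrace is `core::panic::Location` together with `#[track_caller]`, both of which live in `core`. The sketch below is in Rust and runs with std for convenience; `report_err` is a hypothetical helper, not something in the HyperEnclave sources. In a no-std build the `String` would need `alloc`, or the location could be passed straight to the project's own logging macro instead.

```rust
use core::fmt::Debug;
use core::panic::Location;

/// Hypothetical helper: turn an `Err` into a message that carries the
/// file and line of the *caller*, thanks to `#[track_caller]`.
#[track_caller]
fn report_err<T, E: Debug>(res: Result<T, E>) -> Result<T, String> {
    // Must be read here, in the function body, not inside a closure.
    let loc = Location::caller();
    res.map_err(|err| format!("{}:{}: {:?}", loc.file(), loc.line(), err))
}

fn main() {
    // Stand-in for a fallible mapping call such as `hvm.insert(...)`.
    let mapped: Result<(), &str> = Err("AlreadyMapped");
    if let Err(msg) = report_err(mapped) {
        println!("{}", msg);
    }
}
```

Wrapping each fallible call in src/cell.rs this way, in place of a bare `unwrap()`, would print the exact line even when the panic message itself is cut off on the serial console.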

> Could you please change the memory size of your platform to 512 GB?
>
> Yes, I use boot cmdline with mem=512G to limit the linux to use 512GB mem only. But, HyperEnclave still hang the whole machine, the difference is, there are no error logs:

512gb

Cool! Could you please show the dmesg log and the configuration of the kernel command line again?

linguohua commented 7 months ago

Hi @Bonjourz, sorry for the delay again, and many thanks for your patience.

First of all, as mentioned earlier, I have made some changes to HyperEnclave's Rust code:

 // Support max 1M * 4096 = 4GB memory.
-type FrameAlloc = bitmap_allocator::BitAlloc1M;
+type FrameAlloc = bitmap_allocator::BitAlloc16M;
 #[cfg_attr(not(test), global_allocator)]
-static HEAP_ALLOCATOR: LockedHeap = LockedHeap::new();
+static HEAP_ALLOCATOR: LockedHeap<64> = LockedHeap::<64>::new();

Then, my kernel command line:

BOOT_IMAGE=/boot/vmlinuz-5.4.0-166-generic root=UUID=7eb35854-7835-49d7-8c68-b7eb7e8f296c ro mem=512G memmap=32G$0x100000000 amd_iommu=off intremap=off no5lvl

It uses mem=512G to limit the Linux kernel to 512GB of memory.

Now I run bash start_hyperenclave.sh, and dmesg shows:

kern  :info  : [Mon Nov 13 14:37:34 2023] HE: cpu_vendor_detect: 39. Vendor ID: AuthenticAMD
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000000000000 -> 0x00000000000a0000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000000a0000 -> 0x0000000000100000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000000100000 -> 0x0000000030000000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000030000000 -> 0x0000000030047000], type: ACPI Non-volatile Storage
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000030047000 -> 0x0000000075cf0000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000075cf0000 -> 0x0000000076000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000076000000 -> 0x00000000a60b6000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a60b6000 -> 0x00000000a820f000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a820f000 -> 0x00000000a83c9000], type: ACPI Tables
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a83c9000 -> 0x00000000a88d4000], type: ACPI Non-volatile Storage
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a88d4000 -> 0x00000000a97ff000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a97ff000 -> 0x00000000ac000000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000ac000000 -> 0x00000000b0000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000b4280000 -> 0x00000000b4281000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000b5180000 -> 0x00000000b5181000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000be180000 -> 0x00000000be181000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000bf200000 -> 0x00000000bf301000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c8180000 -> 0x00000000c8181000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9100000 -> 0x00000000c9200000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9300000 -> 0x00000000c9401000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9500000 -> 0x00000000c9600000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000f4180000 -> 0x00000000f4181000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000f5180000 -> 0x00000000f5181000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fea00000 -> 0x00000000feb00000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fec00000 -> 0x00000000fec01000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fec10000 -> 0x00000000fec11000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed00000 -> 0x00000000fed01000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed40000 -> 0x00000000fed45000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed80000 -> 0x00000000fed90000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fedc0000 -> 0x00000000fedc1000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fedc2000 -> 0x00000000fedc9000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fee00000 -> 0x00000000fef00000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000ff000000 -> 0x0000000100000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000100000000 -> 0x000000404fe00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000404fe00000 -> 0x0000004050000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000004050000000 -> 0x000000804ff00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000804ff00000 -> 0x0000008050000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000008050000000 -> 0x000000c04ff00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000c04ff00000 -> 0x000000c050000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000fd00000000 -> 0x0000010000000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000010000000000 -> 0x0000013fff300000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000013fff300000 -> 0x0000014000000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000014000000000 -> 0x0000017ffff00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000017ffff00000 -> 0x0000018000000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000018000000000 -> 0x000001bffff00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001bffff00000 -> 0x000001c000000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001c000000000 -> 0x000001fffff00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001fffff00000 -> 0x0000020000000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000020000000000 -> 0x0000023ffff00000], type: System RAM
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000023ffff00000 -> 0x0000024010400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000034030000000 -> 0x0000034040400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000034060000000 -> 0x0000034070400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000004c090000000 -> 0x000004c0a0400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000004c0c0000000 -> 0x000004c0d0400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000005c0f0000000 -> 0x000005c100400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000005c120000000 -> 0x000005c130400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000006c150000000 -> 0x000006c160400000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000007fc00000000 -> 0x000007fc04000000], type: Reserved
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 0]: 0x0000000000000000 -> 0x00000000000a0000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 1]: 0x0000000000100000 -> 0x0000000030000000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 2]: 0x0000000030047000 -> 0x0000000075cf0000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 3]: 0x0000000076000000 -> 0x00000000a60b6000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 4]: 0x00000000a97ff000 -> 0x00000000ac000000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 5]: 0x0000000100000000 -> 0x000000404fe00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 6]: 0x0000004050000000 -> 0x000000804ff00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 7]: 0x0000008050000000 -> 0x000000c04ff00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 8]: 0x0000010000000000 -> 0x0000013fff300000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[ 9]: 0x0000014000000000 -> 0x0000017ffff00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[10]: 0x0000018000000000 -> 0x000001bffff00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[11]: 0x000001c000000000 -> 0x000001fffff00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 213. Convertible Memory[12]: 0x0000020000000000 -> 0x0000023ffff00000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_convertible_memory: 218. Convertible Memory size: 0x1fff7000000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_valid_rsrv_mem: 285. Reserved Memory[ 0]: 0x100000000 -> 0x900000000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_valid_rsrv_mem: 290. Reserved Memory size: 0x800000000
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: get_sme_mask: 68. SME mask: [0x8000000000000]
kern  :info  : [Mon Nov 13 14:37:34 2023] HE: mem_test: 48. Memory[0x100000000 - 0x300000000] test begin
kern  :info  : [Mon Nov 13 14:37:35 2023] HE: mem_test: 78. Memory[0x100000000 - 0x300000000] test pass
kern  :info  : [Mon Nov 13 14:37:35 2023] HE: mem_test: 48. Memory[0x300000000 - 0x500000000] test begin
kern  :info  : [Mon Nov 13 14:37:37 2023] HE: mem_test: 78. Memory[0x300000000 - 0x500000000] test pass
kern  :info  : [Mon Nov 13 14:37:37 2023] HE: mem_test: 48. Memory[0x500000000 - 0x700000000] test begin
kern  :info  : [Mon Nov 13 14:37:40 2023] HE: mem_test: 78. Memory[0x500000000 - 0x700000000] test pass
kern  :info  : [Mon Nov 13 14:37:40 2023] HE: mem_test: 48. Memory[0x700000000 - 0x900000000] test begin
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: mem_test: 78. Memory[0x700000000 - 0x900000000] test pass
kern  :warn  : [Mon Nov 13 14:37:42 2023] HE: get_hypervisor_meminfo: 185. HE_WARN. get_hypervisor_meminfo 1 : 0x100000001, 0x800000000
kern  :warn  : [Mon Nov 13 14:37:42 2023] HE: get_hypervisor_meminfo: 198. HE_WARN. get_hypervisor_meminfo 1: 1
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: get_hv_heap_size: 375. Hypervisor heap size: 0x23f800000
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: get_hv_cmrm_size: 387. Hypervisor cmrm size: 0x35ffff000
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: get_hv_frame_size: 400. Hypervisor frame size: 0xffc00000
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: get_hypervisor_size: 413. Hv_core_and_percpu_size: 0x8b10000, Hypervisor size: 0x6c0000000
kern  :warn  : [Mon Nov 13 14:37:42 2023] HE: get_hypervisor_meminfo: 206. HE_WARN. get_hypervisor_meminfo 2r : 0x100000001, 0x800000000
kern  :warn  : [Mon Nov 13 14:37:42 2023] HE: get_hypervisor_meminfo: 207. HE_WARN. get_hypervisor_meminfo 2h : 0x0, 0x6c0000000
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: he_cmd_enable: 302. hypervisor size: 0x6c0000000
kern  :info  : [Mon Nov 13 14:37:42 2023] HE: get_sme_mask: 68. SME mask: [0x8000000000000]
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: he_cmd_enable: 352. config_size: 2724
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: add_epc_pages: 43. total_epc_pages: 0x140000, free_epc_pages: 0x140000
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: init_enclave_page: 317. epc ranges: [0x7c0000000-0x900000000], 0x140000000
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: init_enclave_page: 333. Initialized EPC ranges size: 0x140000000
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: he_cmd_enable: 383. config_header load_addr: 0xffffff0008b10000
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: he_cmd_enable: 404. mem_region load_addr: 0xffffff0008b10124
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: inspect_tpm: 206. using fake tpm
kern  :info  : [Mon Nov 13 14:37:43 2023] HE: he_cmd_enable: 411. tpm mmio type=8,size=0 pa=ffffffff

IPMI Serial port logs: 512g

The logs above are from a run where the kernel uses 512GB of memory; the system eventually hung.
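For reference, the `Convertible Memory size` value printed in the log above can be converted to GiB with a small helper. This is only an illustrative sketch: `hex_after` and `bytes_to_gib` are hypothetical names and are not part of the HyperEnclave driver.

```rust
/// Extract the hexadecimal value that follows `key` in a log line.
/// Hypothetical helper for reading the dmesg output above.
fn hex_after(line: &str, key: &str) -> Option<u64> {
    // Take the text after the key, then the first whitespace-separated token.
    let rest = line.split(key).nth(1)?.trim();
    let token = rest.split_whitespace().next()?;
    let digits = token.strip_prefix("0x").unwrap_or(token);
    u64::from_str_radix(digits, 16).ok()
}

/// Convert a byte count to GiB (2^30 bytes).
fn bytes_to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 30) as f64
}

fn main() {
    // Line copied from the log above, without the timestamp prefix.
    let line = "HE: get_convertible_memory: 218. Convertible Memory size: 0x1fff7000000";
    let size = hex_after(line, "Convertible Memory size:").expect("no size in line");
    println!("convertible memory: {:#x} bytes = {:.2} GiB", size, bytes_to_gib(size));
}
```

For the logged value `0x1fff7000000` this works out to about 2047.86 GiB, so the E820 table reported by the firmware still covers roughly 2TB in this run.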

The following are the results when the kernel runs with 2TB of memory. Kernel cmdline: BOOT_IMAGE=/boot/vmlinuz-5.4.0-166-generic root=UUID=7eb35854-7835-49d7-8c68-b7eb7e8f296c ro memmap=32G$0x100000000 amd_iommu=off intremap=off no5lvl

Dmesg:

kern  :info  : [Mon Nov 13 15:15:46 2023] HE: cpu_vendor_detect: 39. Vendor ID: AuthenticAMD
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000000000000 -> 0x00000000000a0000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000000a0000 -> 0x0000000000100000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000000100000 -> 0x0000000030000000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000030000000 -> 0x0000000030047000], type: ACPI Non-volatile Storage
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000030047000 -> 0x0000000075cf0000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000075cf0000 -> 0x0000000076000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000076000000 -> 0x00000000a60b6000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a60b6000 -> 0x00000000a820f000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a820f000 -> 0x00000000a83c9000], type: ACPI Tables
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a83c9000 -> 0x00000000a88d4000], type: ACPI Non-volatile Storage
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a88d4000 -> 0x00000000a97ff000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000a97ff000 -> 0x00000000ac000000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000ac000000 -> 0x00000000b0000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000b4280000 -> 0x00000000b4281000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000b5180000 -> 0x00000000b5181000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000be180000 -> 0x00000000be181000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000bf200000 -> 0x00000000bf301000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c8180000 -> 0x00000000c8181000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9100000 -> 0x00000000c9200000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9300000 -> 0x00000000c9401000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000c9500000 -> 0x00000000c9600000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000f4180000 -> 0x00000000f4181000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000f5180000 -> 0x00000000f5181000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fea00000 -> 0x00000000feb00000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fec00000 -> 0x00000000fec01000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fec10000 -> 0x00000000fec11000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed00000 -> 0x00000000fed01000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed40000 -> 0x00000000fed45000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fed80000 -> 0x00000000fed90000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fedc0000 -> 0x00000000fedc1000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fedc2000 -> 0x00000000fedc9000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000fee00000 -> 0x00000000fef00000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x00000000ff000000 -> 0x0000000100000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000000100000000 -> 0x000000404fe00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000404fe00000 -> 0x0000004050000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000004050000000 -> 0x000000804ff00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000804ff00000 -> 0x0000008050000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000008050000000 -> 0x000000c04ff00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000c04ff00000 -> 0x000000c050000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000000fd00000000 -> 0x0000010000000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000010000000000 -> 0x0000013fff300000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000013fff300000 -> 0x0000014000000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000014000000000 -> 0x0000017ffff00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000017ffff00000 -> 0x0000018000000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000018000000000 -> 0x000001bffff00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001bffff00000 -> 0x000001c000000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001c000000000 -> 0x000001fffff00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000001fffff00000 -> 0x0000020000000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000020000000000 -> 0x0000023ffff00000], type: System RAM
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000023ffff00000 -> 0x0000024010400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000034030000000 -> 0x0000034040400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x0000034060000000 -> 0x0000034070400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000004c090000000 -> 0x000004c0a0400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000004c0c0000000 -> 0x000004c0d0400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000005c0f0000000 -> 0x000005c100400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000005c120000000 -> 0x000005c130400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000006c150000000 -> 0x000006c160400000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 136. BIOS E820 table from firmware: [0x000007fc00000000 -> 0x000007fc04000000], type: Reserved
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 0]: 0x0000000000000000 -> 0x00000000000a0000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 1]: 0x0000000000100000 -> 0x0000000030000000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 2]: 0x0000000030047000 -> 0x0000000075cf0000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 3]: 0x0000000076000000 -> 0x00000000a60b6000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 4]: 0x00000000a97ff000 -> 0x00000000ac000000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 5]: 0x0000000100000000 -> 0x000000404fe00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 6]: 0x0000004050000000 -> 0x000000804ff00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 7]: 0x0000008050000000 -> 0x000000c04ff00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 8]: 0x0000010000000000 -> 0x0000013fff300000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[ 9]: 0x0000014000000000 -> 0x0000017ffff00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[10]: 0x0000018000000000 -> 0x000001bffff00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[11]: 0x000001c000000000 -> 0x000001fffff00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 213. Convertible Memory[12]: 0x0000020000000000 -> 0x0000023ffff00000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_convertible_memory: 218. Convertible Memory size: 0x1fff7000000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_valid_rsrv_mem: 285. Reserved Memory[ 0]: 0x100000000 -> 0x900000000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_valid_rsrv_mem: 290. Reserved Memory size: 0x800000000
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: get_sme_mask: 68. SME mask: [0x8000000000000]
kern  :info  : [Mon Nov 13 15:15:46 2023] HE: mem_test: 48. Memory[0x100000000 - 0x300000000] test begin
kern  :info  : [Mon Nov 13 15:15:47 2023] HE: mem_test: 78. Memory[0x100000000 - 0x300000000] test pass
kern  :info  : [Mon Nov 13 15:15:47 2023] HE: mem_test: 48. Memory[0x300000000 - 0x500000000] test begin
kern  :info  : [Mon Nov 13 15:15:50 2023] HE: mem_test: 78. Memory[0x300000000 - 0x500000000] test pass
kern  :info  : [Mon Nov 13 15:15:50 2023] HE: mem_test: 48. Memory[0x500000000 - 0x700000000] test begin
kern  :info  : [Mon Nov 13 15:15:53 2023] HE: mem_test: 78. Memory[0x500000000 - 0x700000000] test pass
kern  :info  : [Mon Nov 13 15:15:53 2023] HE: mem_test: 48. Memory[0x700000000 - 0x900000000] test begin
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: mem_test: 78. Memory[0x700000000 - 0x900000000] test pass
kern  :warn  : [Mon Nov 13 15:15:56 2023] HE: get_hypervisor_meminfo: 185. HE_WARN. get_hypervisor_meminfo 1 : 0x100000001, 0x800000000
kern  :warn  : [Mon Nov 13 15:15:56 2023] HE: get_hypervisor_meminfo: 198. HE_WARN. get_hypervisor_meminfo 1: 1
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: get_hv_heap_size: 375. Hypervisor heap size: 0x23f800000
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: get_hv_cmrm_size: 387. Hypervisor cmrm size: 0x35ffff000
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: get_hv_frame_size: 400. Hypervisor frame size: 0xffc00000
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: get_hypervisor_size: 413. Hv_core_and_percpu_size: 0x8b10000, Hypervisor size: 0x6c0000000
kern  :warn  : [Mon Nov 13 15:15:56 2023] HE: get_hypervisor_meminfo: 206. HE_WARN. get_hypervisor_meminfo 2r : 0x100000001, 0x800000000
kern  :warn  : [Mon Nov 13 15:15:56 2023] HE: get_hypervisor_meminfo: 207. HE_WARN. get_hypervisor_meminfo 2h : 0x0, 0x6c0000000
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: he_cmd_enable: 302. hypervisor size: 0x6c0000000
kern  :info  : [Mon Nov 13 15:15:56 2023] HE: get_sme_mask: 68. SME mask: [0x8000000000000]
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: he_cmd_enable: 352. config_size: 2916
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: add_epc_pages: 43. total_epc_pages: 0x140000, free_epc_pages: 0x140000
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: init_enclave_page: 317. epc ranges: [0x7c0000000-0x900000000], 0x140000000
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: init_enclave_page: 333. Initialized EPC ranges size: 0x140000000
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: he_cmd_enable: 383. config_header load_addr: 0xffffff0008b10000
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: he_cmd_enable: 404. mem_region load_addr: 0xffffff0008b10124
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: inspect_tpm: 206. using fake tpm
kern  :info  : [Mon Nov 13 15:15:57 2023] HE: he_cmd_enable: 411. tpm mmio type=8,size=0 pa=ffffffff

Logs from serial port: 2tb

Bonjourz commented 7 months ago

Hi @linguohua ,

Thanks for providing the information for us.

After trying this on my platform, I'm sorry to tell you that HyperEnclave does not currently support 2TB of memory, even if we limit the memory used by the kernel on the command line with mem=512G.

We are working on adding TB-level memory support to HyperEnclave in the near future. For now, HyperEnclave can only be started on platforms with less than 1TB of memory. Could you please try another platform with 512GB of memory? Sorry for that 🙏

Thank you again for helping us discover that HyperEnclave does not emit a warning when the platform configuration is one it cannot support. We are working on fixing this now.

We would like to discuss this issue with you in more depth 😊. Could you please share your email address, or contact me at bojun.zhu@foxmail.com?

Looking forward to your reply!
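As a rough illustration of the limit described above, the `Convertible Memory[n]` lines from the dmesg output can be checked against a 1 TiB boundary. This is a sketch under assumptions: the threshold `ONE_TIB` (2^40 bytes) and the comparison against each region's end address are taken from the "less than 1TB" statement in this comment, not from HyperEnclave's source, and `parse_region` is a hypothetical helper.

```rust
/// Assumed limit, based on the "less than 1TB" statement in this thread.
const ONE_TIB: u64 = 1 << 40;

/// Parse a "Convertible Memory[n]: START -> END" line into (start, end).
/// Hypothetical helper; not part of HyperEnclave.
fn parse_region(line: &str) -> Option<(u64, u64)> {
    let ranges = line.split("]: ").nth(1)?;
    let (start, end) = ranges.split_once(" -> ")?;
    let hex = |s: &str| {
        let s = s.trim();
        u64::from_str_radix(s.strip_prefix("0x").unwrap_or(s), 16).ok()
    };
    Some((hex(start)?, hex(end)?))
}

fn main() {
    // Three of the regions reported in the logs above.
    let lines = [
        "Convertible Memory[ 7]: 0x0000008050000000 -> 0x000000c04ff00000",
        "Convertible Memory[ 8]: 0x0000010000000000 -> 0x0000013fff300000",
        "Convertible Memory[12]: 0x0000020000000000 -> 0x0000023ffff00000",
    ];
    for line in lines {
        if let Some((start, end)) = parse_region(line) {
            let verdict = if end > ONE_TIB { "above 1 TiB" } else { "within 1 TiB" };
            println!("{:#x}..{:#x}: {}", start, end, verdict);
        }
    }
}
```

In the logs from this issue, regions 8 through 12 start at or above `0x10000000000` (1 TiB), which is consistent with the platform being outside the currently supported range.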

linguohua commented 7 months ago

Hi @Bonjourz, many thanks for your patience and help.

I will try HyperEnclave on a server with 512GB of physical memory, then write a report and mail it to you.

I am closing this issue now, and I hope we can continue the discussion by email.

Bonjourz commented 7 months ago

Hi @linguohua, have you been able to run HyperEnclave on a machine with 512GB of physical memory?

If it is difficult to get a machine with 512GB of physical memory, you can also try our workaround patch (it addresses the issue where virtual addresses beyond 1TB cause an address space collision):

$ git diff src/cell.rs
diff --git a/src/cell.rs b/src/cell.rs
index 928e1e1..f618087 100644
--- a/src/cell.rs
+++ b/src/cell.rs
@@ -140,29 +140,12 @@ impl Cell {
         for region in sys_config.mem_regions() {
             if region.flags.contains(MemFlags::DMA) {
                 let hv_virt_start = phys_to_virt(region.virt_start as GuestPhysAddr);
-                if hv_virt_start < region.virt_start as GuestPhysAddr {
-                    return hv_result_err!(
-                        EINVAL,
-                        format!(
-                            "Guest physical address {:#x} is too large",
-                            region.virt_start
-                        )
-                    );
-                }
                 hvm.insert(MemoryRegion::new_with_offset_mapper(
                     hv_virt_start,
                     region.phys_start as HostPhysAddr,
                     region.size as usize,
                     MemFlags::READ | MemFlags::WRITE,
                 ))?;
-                // Support hardware encrypt when swap out EPC page to guest RAM
-                #[cfg(feature = "sme")]
-                hvm.insert(MemoryRegion::new_with_offset_mapper(
-                    region.virt_start as HostVirtAddr,
-                    region.phys_start as HostPhysAddr,
-                    region.size as usize,
-                    MemFlags::READ | MemFlags::WRITE | MemFlags::ENCRYPTED,
-                ))?;
                 normal_world_mem_region.insert(
                     (region.phys_start as usize)..(region.phys_start + region.size) as usize,
                 )?;

You only need to apply the modifications above to src/cell.rs, and then recompile and install HyperEnclave with:

$ make VENDOR=amd SME=on LOG=warn
$ make VENDOR=amd SME=on LOG=warn install

Then, start HyperEnclave:

$ cd hyperenclave/scripts
$ bash start_hyperenclave.sh
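For readers wondering what the first block removed by the patch was checking: it compares the translated address `hv_virt_start` with the original `region.virt_start` and rejects the region if the result is smaller. A minimal sketch of why such a comparison detects wraparound is below. It assumes that `phys_to_virt` adds a fixed offset and wraps on overflow; the real function is not shown in this thread, and the `PHYS_VIRT_OFFSET` value here is hypothetical and chosen only for illustration.

```rust
/// Hypothetical direct-map offset, for illustration only.
const PHYS_VIRT_OFFSET: u64 = 0xffff_8000_0000_0000;

/// Assumed shape of `phys_to_virt`: add a fixed offset, wrapping on overflow.
fn phys_to_virt(paddr: u64) -> u64 {
    paddr.wrapping_add(PHYS_VIRT_OFFSET)
}

/// Mirrors the removed guard: if the translated address is smaller than
/// the input, the addition wrapped around the 64-bit address space.
fn translate_checked(paddr: u64) -> Result<u64, String> {
    let vaddr = phys_to_virt(paddr);
    if vaddr < paddr {
        return Err(format!("Guest physical address {:#x} is too large", paddr));
    }
    Ok(vaddr)
}

fn main() {
    // Small address: the sum fits in 64 bits, so the guard passes.
    println!("{:?}", translate_checked(0x1_0000_0000));
    // Large address: the sum wraps to a small value, so the guard rejects it.
    println!("{:?}", translate_checked(0x8000_0000_0000));
}
```

This only shows what the comparison detects in general. With the workaround patch applied, that guard and the additional SME-encrypted mapping are both removed, so the patch should be treated as a temporary measure rather than a fix.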