asamy / ksm

A fast, hackable and simple x64 VT-x hypervisor for Windows and Linux. Builtin userspace sandbox and introspection engine.
https://asamy.github.io/ksm/
GNU General Public License v2.0

Debian (9.1) machine freeze after running "sudo ./a.out" #23

Closed CiraciNicolo closed 2 years ago

CiraciNicolo commented 7 years ago

Type of this issue (please specify)

This is a support matter (i.e. your own modified tree). I've removed CPU_DYING because it is not supported anymore.

System information

  1. CPU: Intel (Codename: i5-4278U)
  2. Kernel: Linux
  3. Kernel version: 4.9.0-3-amd64

Issue description

After running sudo ./a.out the machine freezes. I was able to work out that the "ioctl" call causes the freeze. Since the machine just becomes unresponsive, there are no logs.

asamy commented 7 years ago

On Thu, Oct 5, 2017 at 5:01 PM Nicolò Ciraci notifications@github.com wrote:

Type of this issue (please specify)

This is a bug in the upstream tree as-is unmodified.

System information

  1. CPU: Intel (Codename: i5-4278U)
  2. Kernel: Linux
  3. Kernel version: 4.9.0-3-amd64

Issue description

After running sudo ./a.out the machine freezes. I was able to work out that the "ioctl" call causes the freeze. Since the machine just becomes unresponsive, there are no logs.

Comment out the code in ept_memory_type() function and make it just return EPT_MT_WRITEBACK, see if that fixes it.


CiraciNicolo commented 7 years ago

I'm still experiencing the problem, also when compiling I get this warning:

ksm/exit.c: In function ‘vcpu_sync_idt’:
ksm/exit.c:2099:1: warning: the frame size of 4112 bytes is larger than 2048 bytes [-Wframe-larger-than=]
 }

asamy commented 7 years ago

This warning is a false positive: the host stack is two 4-KByte pages in size. I am not entirely sure what the issue could be; it might be related to #22.

Are you testing on VM or baremetal? What is your RAM capacity?

CiraciNicolo commented 7 years ago

I'm testing on baremetal, with 8 GB of RAM. Looking into #22, could esoterix's comment lead somewhere?

asamy commented 7 years ago

No, commit 85a228dffd5d6559879008b22f4fac0dad8fe542 fixes what he pointed out.

asamy commented 7 years ago

Someone reported that disabling some VMCS controls fixes the freeze, but he hasn't pointed out which one is faulty.

Since I can't reproduce this freeze at all myself, can you disable them one by one and let me know here which one is faulty?

He pointed out that removing all bits in req_cpuctl fixes it, but the faulty control could be one of the secondary controls, since the CPU control is what enables the secondary controls.
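The one-by-one procedure described above can be sketched roughly as follows. This is only an illustration, not ksm code: `clear_nth_set_bit` and `count_set_bits` are hypothetical helpers, and the mask passed in stands for whatever value req_cpuctl (or the secondary control mask) holds in your tree. The idea is to build one test module per set bit, each with exactly one control removed, and note which build stops freezing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of ksm): return `mask` with its n-th set
 * bit cleared, counting set bits from the least significant one, starting
 * at 0. If `mask` has fewer than n+1 set bits, it is returned unchanged.
 */
static uint32_t clear_nth_set_bit(uint32_t mask, unsigned n)
{
    assert(n < 32);

    for (unsigned bit = 0; bit < 32; ++bit) {
        uint32_t b = UINT32_C(1) << bit;

        if (!(mask & b))
            continue;       /* this control is not requested, skip it */
        if (n-- == 0)
            return mask & ~b;
    }

    return mask;
}

/* Number of set bits, i.e. how many single-control test builds are needed. */
static unsigned count_set_bits(uint32_t mask)
{
    unsigned count = 0;

    for (; mask; mask &= mask - 1)
        ++count;
    return count;
}
```

For a mask with three requested controls, this yields three candidate masks (n = 0, 1, 2). If clearing the bit that enables the secondary controls is the build that stops the freeze, the same procedure would then have to be repeated on the secondary control mask.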

CiraciNicolo commented 7 years ago

I've been able to get the stack trace of the crash via tty: we get a GPF. I've attached an image because I can't get the text log. (Sorry for the quality.) img_0473

asamy commented 7 years ago

Upload ksmlinux.ko

CiraciNicolo commented 7 years ago

Here we go! ksmlinux.ko.zip

CiraciNicolo commented 7 years ago

I found out that after a while, the machine unfreezes. I don't understand what is happening.

asamy commented 7 years ago

I had this issue before with ept_memory_type, I just decided to comment it out without actually looking at the issue. From the binary you provided, it's not commented out like I suggested you do before, so please do so and let me know here just to confirm.

Have it like this:

#if 0
    int i;
    struct mtrr_range *range;
    u8 type = 0xff;

    for (i = 0; i < k->mtrr_count; ++i) {
        range = &k->mtrr_ranges[i];
        if (!in_bounds(gpa, range->start, range->end))
            continue;

        if (range->fixed || range->type == EPT_MT_UNCACHABLE)
            return range->type;

        if (range->type == EPT_MT_WRITETHROUGH && type == EPT_MT_WRITEBACK)
            type = EPT_MT_WRITETHROUGH;
        else
            type = range->type;
    }

    if (type == 0xff)
        type = k->mtrr_def;

    return type;
#else
    return EPT_MT_WRITEBACK;
#endif

hmkawakami commented 7 years ago

I was having the same issue, and I just found the problem. The MAX_RANGES in mm.h was too small. I had 10 physical memory regions, and MAX_RANGES default value is 8.
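A minimal sketch of the failure mode described here, assuming the ranges are stored in a fixed-size array. The struct and function names below are hypothetical and do not come from mm.h; only the MAX_RANGES name and its default of 8 are taken from the comment above. The point is that an append into a fixed table should report when it runs out of slots, rather than silently dropping ranges or writing past the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 8    /* default value reported in this thread */

/* Hypothetical types for illustration only. */
struct pmem_range {
    uint64_t start;
    uint64_t end;
};

struct pmem_table {
    size_t count;
    struct pmem_range ranges[MAX_RANGES];
};

/*
 * Append a range to the table. Returns false when the table is already
 * full, so the caller can fail initialization loudly.
 */
static bool pmem_table_add(struct pmem_table *t, uint64_t start, uint64_t end)
{
    assert(t != NULL);

    if (t->count >= MAX_RANGES)
        return false;

    t->ranges[t->count].start = start;
    t->ranges[t->count].end = end;
    t->count++;
    return true;
}
```

With 10 physical memory regions and the default of 8, the last two calls would return false; raising MAX_RANGES is the fix reported above.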

CiraciNicolo commented 7 years ago

Actually, I did comment it out, but since it didn't fix the issue I uncommented it. I tested it again just now, and um.c doesn't freeze anymore, but I get

subvert: Invalid argument
ret: 0xFFFFFFFF

EDIT: I reran a.out and got another freeze.

CiraciNicolo commented 7 years ago

I ran it again, but this time the machine did not freeze, and I was able to get this from dmesg:

[  106.265407] ksm: CPU 0: ksm_init: EPT/VPID caps: 0x00000F0106134141
[  106.265418] ksm: CPU 0: ksm_init: 2 physical memory ranges
[  106.265419] ksm: CPU 0: ksm_init: Range: 0x0000000000001000 -> 0x000000000009FBFF
[  106.265419] ksm: CPU 0: ksm_init: Range: 0x0000000000100000 -> 0x000000003FFEBFFF
[  106.265434] ksm: CPU 0: ksm_init: 41 MTRR ranges (0 default type)
[  106.265434] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000000000 -> 0x000000000000FFFF fixed: 1 type: 6
[  106.265435] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000010000 -> 0x000000000001FFFF fixed: 1 type: 6
[  106.265436] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000030000 -> 0x000000000003FFFF fixed: 1 type: 6
[  106.265436] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000060000 -> 0x000000000006FFFF fixed: 1 type: 6
[  106.265437] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000A0000 -> 0x00000000000AFFFF fixed: 1 type: 6
[  106.265437] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F0000 -> 0x00000000000FFFFF fixed: 1 type: 6
[  106.265438] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000150000 -> 0x000000000015FFFF fixed: 1 type: 6
[  106.265438] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000001C0000 -> 0x00000000001CFFFF fixed: 1 type: 6
[  106.265439] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000080000 -> 0x0000000000083FFF fixed: 1 type: 6
[  106.265439] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000084000 -> 0x0000000000087FFF fixed: 1 type: 6
[  106.265440] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000008C000 -> 0x000000000008FFFF fixed: 1 type: 6
[  106.265440] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000098000 -> 0x000000000009BFFF fixed: 1 type: 6
[  106.265442] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000A8000 -> 0x00000000000ABFFF fixed: 1 type: 6
[  106.265443] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000BC000 -> 0x00000000000BFFFF fixed: 1 type: 6
[  106.265443] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D4000 -> 0x00000000000D7FFF fixed: 1 type: 6
[  106.265444] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F0000 -> 0x00000000000F3FFF fixed: 1 type: 6
[  106.265444] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C0000 -> 0x00000000000C0FFF fixed: 1 type: 5
[  106.265445] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C1000 -> 0x00000000000C1FFF fixed: 1 type: 5
[  106.265445] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C3000 -> 0x00000000000C3FFF fixed: 1 type: 5
[  106.265446] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C6000 -> 0x00000000000C6FFF fixed: 1 type: 5
[  106.265446] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000CA000 -> 0x00000000000CAFFF fixed: 1 type: 5
[  106.265447] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000CF000 -> 0x00000000000CFFFF fixed: 1 type: 5
[  106.265447] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D5000 -> 0x00000000000D5FFF fixed: 1 type: 5
[  106.265448] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000DC000 -> 0x00000000000DCFFF fixed: 1 type: 5
[  106.265448] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C0000 -> 0x00000000000C0FFF fixed: 1 type: 5
[  106.265449] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C9000 -> 0x00000000000C9FFF fixed: 1 type: 5
[  106.265449] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D3000 -> 0x00000000000D3FFF fixed: 1 type: 5
[  106.265450] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000DE000 -> 0x00000000000DEFFF fixed: 1 type: 5
[  106.265450] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000EA000 -> 0x00000000000EAFFF fixed: 1 type: 5
[  106.265451] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F7000 -> 0x00000000000F7FFF fixed: 1 type: 5
[  106.265451] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000105000 -> 0x0000000000105FFF fixed: 1 type: 5
[  106.265452] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000114000 -> 0x0000000000114FFF fixed: 1 type: 5
[  106.265452] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000C0000 -> 0x00000000000C0FFF fixed: 1 type: 5
[  106.265453] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000D1000 -> 0x00000000000D1FFF fixed: 1 type: 5
[  106.265453] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000E3000 -> 0x00000000000E3FFF fixed: 1 type: 5
[  106.265454] ksm: CPU 0: ksm_init: MTRR Range: 0x00000000000F6000 -> 0x00000000000F6FFF fixed: 1 type: 5
[  106.265454] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000010A000 -> 0x000000000010AFFF fixed: 1 type: 5
[  106.265455] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000011F000 -> 0x000000000011FFFF fixed: 1 type: 5
[  106.265455] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000135000 -> 0x0000000000135FFF fixed: 1 type: 5
[  106.265456] ksm: CPU 0: ksm_init: MTRR Range: 0x000000000014C000 -> 0x000000000014CFFF fixed: 1 type: 5
[  106.265456] ksm: CPU 0: ksm_init: MTRR Range: 0x0000000000000000 -> 0x000000003FFFFFFF fixed: 0 type: 0
[  106.265470] ksm: CPU 0: ksm_start: Major: 248
[  106.266087] ksm: CPU 0: ksm_start: ready
[  113.046346] ksm: CPU 0: ksm_open: open() from a.out
[  113.046349] ksm: CPU 0: ksm_ioctl: ioctl from a.out: cmd(0x00004B02)
[  113.053502] ksm: CPU 0: vcpu_run: 1: something went wrong: 12
[  113.053524] BUG: unable to handle kernel paging request at fffffffffffffff3
[  113.053565] IP: [<ffffffffc04d571c>] __vmx_vminit+0x4a/0x52 [ksmlinux]
[  113.053592] PGD 27a0a067 
[  113.053599] PUD 27a0c067 
[  113.053615] PMD 0 

[  113.053630] Oops: 0000 [#1] SMP
[  113.053649] Modules linked in: ksmlinux(O) fuse usblp prl_fs_freeze(PO) prl_fs(PO) prl_eth(PO) x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf uvcvideo videobuf2_vmalloc evdev videobuf2_memops snd_intel8x0 serio_raw videobuf2_v4l2 snd_ac97_codec pcspkr ac97_bus videobuf2_core snd_pcm videodev snd_timer snd soundcore media lpc_ich sg mfd_core shpchp pvpanic prl_tg(PO) virtio_balloon sbs sbshc binfmt_misc battery ac button parport_pc ppdev lp parport ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sd_mod sr_mod cdrom ata_generic crc32c_intel virtio_net aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd ata_piix ahci libahci psmouse i2c_i801 i2c_smbus libata scsi_mod xhci_pci xhci_hcd
[  113.053934]  uhci_hcd ehci_pci ehci_hcd usbcore usb_common virtio_pci virtio_ring virtio
[  113.053965] CPU: 0 PID: 2689 Comm: a.out Tainted: P           O    4.9.0-4-amd64 #1 Debian 4.9.51-1
[  113.054001] Hardware name: Parallels Software International Inc. Parallels Virtual Platform/Parallels Virtual Platform, BIOS 12.2.0 (41591) 04/03/2017
[  113.054044] task: ffff9c037ae7f040 task.stack: ffffac4400b70000
[  113.054064] RIP: 0010:[<ffffffffc04d571c>]  [<ffffffffc04d571c>] __vmx_vminit+0x4a/0x52 [ksmlinux]
[  113.054101] RSP: 0018:ffffac4400b73d90  EFLAGS: 00010046
[  113.054121] RAX: 0000000000000000 RBX: 000000000000003a RCX: 0000000000000000
[  113.054144] RDX: 0000000000000000 RSI: 00000000fee00037 RDI: ffff9c034aa01000
[  113.054166] RBP: 0000000000000000 R08: 00000000fee00030 R09: 0000fffffffff000
[  113.054189] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9c034aa00000
[  113.054211] R13: ffff9c034aa01000 R14: 0000000000000000 R15: 0000000000000000
[  113.054234] FS:  00007f23714c5700(0000) GS:ffff9c037de00000(0000) knlGS:0000000000000000
[  113.054258] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  113.054278] CR2: fffffffffffffff3 CR3: 00000000399ac000 CR4: 00000000001426f0
[  113.054301] Stack:
[  113.054314]  ffffffffc04d1a4b 0000000000000202 ffffac4400b73e08 ffff9c034aa00000
[  113.054341]  0000000000000000 0000000000000000 ffffffffc04d1b2a ffffffffbd6f96e8
[  113.054368]  ffffac4400b73e58 ffff9c037ae7f680 0000000000000000 ffffffffc04d1b20
[  113.054395] Call Trace:
[  113.054413]  [<ffffffffc04d1a4b>] ? __ksm_init_cpu+0xab/0x180 [ksmlinux]
[  113.054439]  [<ffffffffc04d1b2a>] ? __percpu___call_init+0xa/0x20 [ksmlinux]
[  113.054466]  [<ffffffffbd6f96e8>] ? generic_exec_single+0x98/0x100
[  113.054488]  [<ffffffffc04d1b20>] ? __ksm_init_cpu+0x180/0x180 [ksmlinux]
[  113.054511]  [<ffffffffbd6f9818>] ? smp_call_function_single+0xc8/0x130
[  113.054534]  [<ffffffffbd77af2e>] ? printk+0x57/0x73
[  113.054554]  [<ffffffffc04d1b7a>] ? ksm_subvert+0x3a/0x70 [ksmlinux]
[  113.054577]  [<ffffffffc04d54b3>] ? ksm_ioctl+0x2c3/0x40e [ksmlinux]
[  113.054601]  [<ffffffffbd816f1f>] ? do_vfs_ioctl+0x9f/0x600
[  113.054621]  [<ffffffffbd8174f4>] ? SyS_ioctl+0x74/0x80
[  113.055119]  [<ffffffffbdc085bb>] ? system_call_fast_compare_end+0xc/0x9b
[  113.055592] Code: 48 ba 24 57 4d c0 ff ff ff ff e8 00 e2 ff ff 58 59 5a 5b 48 83 c4 08 5d 5e 5f 41 58 41 59 41 5a 41 5b 41 5c 41 5d 41 5e 41 5f 9d <8b> 04 25 f3 ff ff ff c3 58 59 5a 5b 48 83 c4 08 5d 5e 5f 41 58 
[  113.057140] RIP  [<ffffffffc04d571c>] __vmx_vminit+0x4a/0x52 [ksmlinux]
[  113.058172]  RSP <ffffac4400b73d90>
[  113.059476] CR2: fffffffffffffff3
[  113.060388] ---[ end trace a4e8d77f6429cfff ]---
[  113.063077] ksm: CPU 0: ksm_release: release() from a.out

asamy commented 7 years ago

Set EPTP_INIT_USED (ksm.h) to 1, see if that fixes it.

CiraciNicolo commented 7 years ago

Still freezes

asamy commented 7 years ago

That physical memory range output is weird, given that you said you have 8 GB of RAM. Are you using a VM now or something?

Maybe it's not pre-allocating physical RAM like it should, so it's getting a lot of EPT violations to allocate them and that causes the freeze. Maybe the code that gets the physical memory ranges is faulty...

Regardless, assuming those are the physical memory ranges you have (i.e. the output matches the actual ranges), then those are not enough.
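As a quick sanity check on the dmesg output above, the two logged physical memory ranges can be summed. The sketch below assumes the logged end addresses are inclusive (they end in ...FF), and `struct range` and `total_bytes` are hypothetical names, not ksm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for illustration; `end` is treated as inclusive. */
struct range {
    uint64_t start;
    uint64_t end;
};

/* Sum the number of bytes covered by n inclusive ranges. */
static uint64_t total_bytes(const struct range *r, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; ++i) {
        assert(r[i].end >= r[i].start);
        total += r[i].end - r[i].start + 1;
    }
    return total;
}

/* The two "Range:" lines printed by ksm_init in the log above. */
static const struct range logged_ranges[] = {
    { 0x0000000000001000ULL, 0x000000000009FBFFULL },
    { 0x0000000000100000ULL, 0x000000003FFEBFFFULL },
};
```

Calling `total_bytes(logged_ranges, 2)` gives roughly 1 GiB, nowhere near 8 GB, which is consistent with the later confirmation that this log came from a VM rather than the baremetal machine.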

CiraciNicolo commented 7 years ago

Yeah, now I'm using a VM so I don't have to reboot every time the machine freezes.

asamy commented 7 years ago

So, I tested today and I haven't been able to reproduce it, either on a VM or on baremetal (both Windows 10 and Linux 4.13.8-1). The only difference is that my CPU is an i7-5550U (Broadwell).

Have you been able to find any clue other than the double crash? Can you disable features until you find something out of the ordinary?

CrazyHarb commented 2 years ago

Good morning sir, I've hit the same 'freeze' issue on 'Ubuntu 16.04.1' (kernel version '4.15.0-29-generic'): when I try to run 'sudo ./a.out', the VM freezes.

BUT, I've found something interesting out of the blue:

  1. Other code also freezes.
  2. The code can enter VMX-host, but not always.

So I guess that maybe the code has been swapped out to disk. My VM has 2GB of memory.