asamy / ksm

A fast, hackable and simple x64 VT-x hypervisor for Windows and Linux. Builtin userspace sandbox and introspection engine.
https://asamy.github.io/ksm/
GNU General Public License v2.0

Create a separate PML4 table for the Host #13

Closed asamy closed 7 years ago

asamy commented 7 years ago

Perhaps even create completely new tables for the Host, e.g. GDT, IDT, LAPIC, etc. This would make it much better and more flexible; we could even use it as a boot-time hypervisor that way (e.g. start before the kernel does).

This is quite a low priority issue, but I will play with this sometime later for fun.
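One common way to build such a table is to give the host its own PML4 that keeps only the kernel half of an existing one. The sketch below is a userspace model of that idea, not KSM code: `host_pml4_clone_kernel_half` and `pml4_index` are hypothetical names, and the split at entry 256 assumes standard 4-level x86-64 paging with 48-bit canonical addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PML4_ENTRIES      512	/* 9 index bits per paging level */
#define KERNEL_HALF_START 256	/* first entry of the upper (kernel) half */

/* Index into the PML4 for a virtual address: bits 47..39. */
static inline unsigned pml4_index(uint64_t va)
{
	return (unsigned)((va >> 39) & 0x1ff);
}

/*
 * Hypothetical helper: fill a private host PML4 from an existing one,
 * copying only the kernel-half entries and leaving every user entry empty.
 */
static void host_pml4_clone_kernel_half(uint64_t *host, const uint64_t *src)
{
	assert(host != src);	/* the host table must be a separate page */

	memset(host, 0, KERNEL_HALF_START * sizeof(*host));
	memcpy(&host[KERNEL_HALF_START], &src[KERNEL_HALF_START],
	       (PML4_ENTRIES - KERNEL_HALF_START) * sizeof(*host));
}
```

In a real hypervisor the table would be a page-aligned physical page whose address is written to the host CR3 field of the VMCS; the model only shows which entries would be shared.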

zhexwang commented 7 years ago

Hi, are you looking for host CR3?

zhexwang commented 7 years ago

I have solved this problem. You can borrow the insmod process's page table as the host page table. To keep it from being freed when insmod exits, you need "atomic_inc(&current->active_mm->mm_count);" [the current task is the insmod process]. When you switch off the hypervisor, you need to decrease this "mm_count" again!
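The pinning idea can be modeled in userspace with a plain reference count. This is only a sketch under stated assumptions: `struct mm_model`, `mm_pin` and `mm_unpin` are hypothetical stand-ins for the kernel's `struct mm_struct`, the `atomic_inc` on `mm_count`, and the matching decrement, and nothing here touches real page tables.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct mm_struct; only the refcount is modeled. */
struct mm_model {
	atomic_int mm_count;
};

/* Pin the address space so its page tables outlive the loading process. */
static void mm_pin(struct mm_model *mm)
{
	atomic_fetch_add(&mm->mm_count, 1);
}

/* Drop one reference; returns true when this was the last one. */
static bool mm_unpin(struct mm_model *mm)
{
	int old = atomic_fetch_sub(&mm->mm_count, 1);

	assert(old > 0);	/* unbalanced unpin would be a bug */
	return old == 1;
}
```

With the count starting at 1 for the loading process, a pin at load time means the process's own release leaves the count at 1, and only the hypervisor's unpin at teardown brings it to 0. Forgetting that final unpin would leak the address space; doing it twice would free it while still in use.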

asamy commented 7 years ago

That's a nice fix, thanks for that. If you can provide a patch that'd be extremely helpful.

zhexwang commented 7 years ago

I am sorry, but my hypervisor is not based on yours. My hypervisor is only for Linux/x86_64.

zhexwang commented 7 years ago

But I have a lot of experience porting your ksm to the Linux platform. I'm not sure how I can help you~

asamy commented 7 years ago

What I mean is: edit the code in a local clone of KSM, commit your changes, and send a patch based on those local commits. That way you also get credit, since you'll be the author of the commit and I will apply it as the committer. But that's entirely up to you; otherwise I'll just provide a link in my commit later tonight.

I don't really know what you can help with; perhaps that's actually your call here? You probably know what the defects are, so if you find something wrong with any part of KSM, submitting an issue is extremely helpful, and providing a patch directly (or a pull request) is far better, obviously.

See here for more information on how you can help.

zhexwang commented 7 years ago

Got it. Thank you very much. I will try to do this...

asamy commented 7 years ago

I tried your hack, but it did not seem to work; my machine just hung. I'm using kernel 4.8.13.

zhexwang commented 7 years ago

May I see your code? I have tested it on kernels 3.2 and 3.8. Did you load your kernel module using the insmod command?

asamy commented 7 years ago

Sure... Here's the diff against the upstream tree:

diff --git a/ksm.c b/ksm.c
index 2404912..fb12ba8 100644
--- a/ksm.c
+++ b/ksm.c
@@ -123,6 +123,7 @@ int __ksm_init_cpu(struct ksm *k)
    /* Required MSR_IA32_FEATURE_CONTROL bits:  */
    u64 required_feat_bits = FEATURE_CONTROL_LOCKED |
        FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
+
 #ifdef __linux__
    if (tboot_enabled())
        required_feat_bits |= FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX;
@@ -145,14 +146,13 @@ int __ksm_init_cpu(struct ksm *k)
            return ERR_NOMEM;
        }

-       k->kernel_cr3 = __readcr3();
+       k->origin_cr3 = __readcr3();
        u8 ret = __vmx_vminit(vcpu);
        VCPU_DEBUG("Started: %d\n", !ret);

        if (ret == 0)
            k->active_vcpus++, vcpu->subverted = true;
-       else
-           /* vcpu_run() failed, cleanup:  */
+       else    /* vcpu_run() failed, cleanup:  */
            vcpu_free(vcpu);
        return ret;
 #ifndef __GNUC__
@@ -212,9 +212,6 @@ int ksm_init(void)
     */
    __stosq((u64 *)&ksm, 0, sizeof(ksm) >> 3);

-   /* Caller cr3 (could be user)  */
-   ksm.origin_cr3 = __readcr3();
-
 #ifdef EPAGE_HOOK
    htable_init(&ksm.ht, rehash, NULL);
 #endif
diff --git a/main_linux.c b/main_linux.c
index 0caa7d1..431cf92 100644
--- a/main_linux.c
+++ b/main_linux.c
@@ -25,22 +25,10 @@

 #include "ksm.h"

-/*
- * FIXME: Get rid of this work queue stuff.
- * Currently they are just a workaround since init_mm / init_task /
- * init_level4_pgd aren't exported, so we need some way to hack some resident
- * CR3 which is kworker in this case...  Rather than using insmod/modprobe's
- * CR3 which will die eventually.
- */
-static void ksm_worker(struct work_struct *);
-static struct workqueue_struct *wq;
-static DECLARE_DELAYED_WORK(work, ksm_worker);
-
 static inline void do_cpu(void *v)
 {
    int (*f) (struct ksm *) = v;
-   VCPU_DEBUG("On CPU calling %p\n", f);
-   f(&ksm);
+   VCPU_DEBUG("On CPU %p: %d\n", f, f(&ksm));
 }

 static int cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
@@ -63,38 +51,35 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action, void *
 static struct notifier_block cpu_notify = {
    .notifier_call = cpu_callback
 };
-
-static void ksm_worker(struct work_struct *w)
-{
-   int ret;
-   VCPU_DEBUG("in ksm_worker(): %s\n", current->comm);
-
-   ret = ksm_init();
-   VCPU_DEBUG("init: %d\n", ret);
-}
+static struct mm_struct *mm;

 int __init ksm_start(void)
 {
-   wq = create_singlethread_workqueue("worker_ksm");
-   if (!wq)
-       return -ENOMEM;
+   int ret = 0;

-   if (!queue_delayed_work(wq, &work, 100)) {
-       destroy_workqueue(wq);
-       return -EINVAL;
-   }
+   VCPU_DEBUG("ksm_start(): stealing %s's CR3 as host's\n", current->comm);
+   mm = current->active_mm;    /* or even ->mm  */
+   atomic_inc(&mm->mm_count);
+   ksm.kernel_cr3 = __readcr3();   /* or even __pa(mm->pgd)  */

-   VCPU_DEBUG_RAW("Done, wait for wq to fire\n");
-   register_hotcpu_notifier(&cpu_notify);
-   return 0;
+   ret = ksm_init();
+   if (ret == 0)
+       register_hotcpu_notifier(&cpu_notify);
+   else
+       mmdrop(mm);
+
+   VCPU_DEBUG("init: %d\n", ret);
+   return ret;
 }

 void __exit ksm_cleanup(void)
 {
+   int r;
    unregister_hotcpu_notifier(&cpu_notify);
-   destroy_workqueue(wq);
-   VCPU_DEBUG("exit: %d\n", ksm_exit());
-   VCPU_DEBUG("Bye\n");
+
+   r = ksm_exit();
+   mmdrop(mm);
+   VCPU_DEBUG("exit: %d\n", r);
 }

 module_init(ksm_start);

zhexwang commented 7 years ago

Your code is correct. I don't know why your machine hung. Did you check what caused the hang? I don't think there is a problem here.