ememos / GiantVM

9 stars 9 forks source link

Fix race condition on "wait_remote_cpu_count" #20

Closed YWHyuk closed 3 years ago

YWHyuk commented 3 years ago

Race condition

There is a race condition between "wake_remote_cpu" caller thread and remtoe vCPU threads. Waker thread wake up remote vCPU threads. Then, it sets the value of "wake_remote_cpu_count" without any lock. So, make the value set before waking up the remote vCPU threads.

Waker Thread

void wake_remote_cpu(void) {
    qemu_cond_broadcast(&qemu_remote_cpu_cond);    /* Wake up remote vCPU thread */
    wait_remote_cpu_count = smp_cpus - local_cpus; /* Race condition!!! */
    while (wait_remote_cpu_count > 0) {
        /*wait all remote cpu thread to change CPU status */
    }
}

Remote vCPU Thread

<Remote vCPU routine>
  qemu_cond_wait(&qemu_remote_cpu_cond, &qemu_remote_mutex); /* Wait for condition */
  wait_remote_cpu_count--;                                   /* Race condition!!! */

This may be the reason for the qemu process not responding after the Guest OS shutdown.

Unlock twice

Also, remote vCPU thread call "qemu_mutex_unlock_iothread" twice. For pthreads it will result in undefined behaviour. So Fix it by adding lock function.

<Remote vCPU routine>
   else {
        printf("CPU %d is remote CPU, pause\n", cpu->cpu_index);
        qemu_mutex_unlock_iothread(); <---------------------------- (1)
        /* use a new lock since qemu_global_mutex may cause deadlock */
        qemu_mutex_lock(&qemu_remote_mutex);
        while (1) {
            qemu_cond_wait(&qemu_remote_cpu_cond, &qemu_remote_mutex);
            wait_remote_cpu_count--;
            if (qemu_shutdown_requested_get()) {
                cpu->stopped = true;
                break;
            }
            if (qemu_reset_requested_get()) {
                cpu->stopped = true;
            }
        }
        qemu_mutex_unlock(&qemu_remote_mutex);
    }

    qemu_kvm_destroy_vcpu(cpu);
    cpu->created = false;
    qemu_cond_signal(&qemu_cpu_cond);
    qemu_mutex_unlock_iothread(); <---------------------------- (2)
    return NULL;
}

Let me know if there's anything I missed 😉 Thanks!

solemnify commented 3 years ago

Great work!! We will review soon..

YWHyuk commented 3 years ago

@ememos

I thought we are in the KVM mode. So, I considered tcg_enabled() as false without much thought. Is that wrong? --> QEMU can directly execute guest code using KVM, but it can also operate as an emulator without KVM. Therefore, it is expected that there will be no side effect in the general case if possible.

Okay, I see what you're worried about. I'm really sorry for bothering you and this will be a last question.

I touched qemu_kvm_cpu_thread_fn function and this is called by qemu_kvm_start_vcpu function. Also qemu_kvm_start_vcpu is called when kvm_enabled().

void qemu_init_vcpu(CPUState *cpu)
{
    cpu->nr_cores = smp_cores;
    cpu->nr_threads = smp_threads;
    cpu->stopped = true;

    if (!cpu->as) {
        /* If the target cpu hasn't set up any address spaces itself,
         * give it the default one.
         */
        AddressSpace *as = address_space_init_shareable(cpu->memory,
                                                        "cpu-memory");
        cpu->num_ases = 1;
        cpu_address_space_init(cpu, as, 0);
    }

    if (kvm_enabled()) {
        qemu_kvm_start_vcpu(cpu);
    } else if (tcg_enabled()) {
        qemu_tcg_init_vcpu(cpu);
    } else {
        qemu_dummy_start_vcpu(cpu);
    }
}

So, my question is "Should I still consider tcg even if the part I modified is dependent on kvm?"

ememos commented 3 years ago

Ok. You're right. Your modifications are limited to KVM, so let's proceed with the original proposal you suggested.