Hurd_SMP project

Objective

The objective of this project is to fix and complete SMP (multiprocessing) support in GNU/Hurd. This support must be implemented in GNU/Hurd's microkernel, GNU Mach.

Original status:

GNU/Hurd includes minimal SMP support, as this FAQ explains.
The GNU Mach source code includes many special cases for multiprocessor systems, guarded by #if NCPUS > 1 preprocessor conditionals.

But this support is very limited:

GNU Mach doesn't detect CPUs at runtime: the number of CPUs must be hardcoded at compile time.
The number of cpus is set in the mach_ncpus configuration variable, which defaults to 1, in the configfrag.ac file. This variable generates the NCPUS macro, which gnumach uses to control the multiprocessor special cases.
If NCPUS > 1, gnumach enables multiprocessor support, with the number of cpus set by the user in the mach_ncpus variable. Otherwise, this support is disabled.
The multiprocessor special cases in the gnumach source code have never been tested, so they may contain many errors. Furthermore, these special cases are incomplete: many functions, such as cpu_number() or intel_startCPU(), aren't written.
GNU Mach doesn't initialize the processors with the proper options for multiprocessing. For this reason, the current support is only multithreading, not real multiprocessing.
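As an illustration of the compile-time switch described above, the following minimal sketch shows how a #if NCPUS > 1 guard selects between a uniprocessor and a multiprocessor code path. The helper names configured_cpu_slots() and smp_compiled_in(), and the fallback definition of NCPUS, are hypothetical and exist only for this example; they are not taken from the gnumach sources.

```c
#include <assert.h>

/* NCPUS is normally generated at configure time from mach_ncpus.
 * This fallback exists only so that the sketch compiles on its own. */
#ifndef NCPUS
#define NCPUS 1
#endif

/* Hypothetical helper: number of per-cpu slots reserved at build time. */
static int configured_cpu_slots(void)
{
    assert(NCPUS >= 1);
#if NCPUS > 1
    /* Multiprocessor build: one slot per configured cpu. */
    return NCPUS;
#else
    /* Uniprocessor build: the SMP special cases are compiled out. */
    return 1;
#endif
}

/* Hypothetical helper: tells whether the SMP code paths were compiled in. */
static int smp_compiled_in(void)
{
    return NCPUS > 1;
}
```

Because the decision is made by the preprocessor, a kernel built with NCPUS set to 1 contains none of the multiprocessor code, whatever hardware it later runs on.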

In addition, there is another problem: many drivers included in Hurd aren't thread-safe, and they could crash in an SMP environment. So it's necessary to isolate these drivers to avoid concurrency problems.

Solution

To solve this, we need to implement routines to detect the number of processors, assign an identifier to each processor, and configure the lapic and IPI support. These routines must be executed during Mach boot.

"Really, all the support you want to get from the hardware is just getting the number of processors, initializing them, and support for interprocessor interrupts (IPI) for signaling." - Samuel Thibault link

"The process scheduler probably already has the support. What is missing is the hardware driver for SMP: enumeration and initialization." - Samuel Thibault link

The functions currently needed are cpu_number() (in kern/cpu_number.h) and intel_startCPU(). Another unimplemented function, though not a critical one, is cpu_control(). Reference

Other interesting files are pmap.c and sched_prim.c

In addition, we have to build an isolated environment to execute the non-thread-safe drivers.

"Yes, this is a real concern. For the Linux drivers, the long-term goal is to move them to userland anyway. For Mach drivers, quite often they are not performance-sensitive, so big locks would be enough." - Samuel Thibault link

Project draft

You can read the full project draft in Hurd SMP Project draft

How to test

To test the software you will need:

Debian GNU/Hurd installation: The Debian GNU/Hurd installer is pretty similar to a standard Debian installer.
- You can follow this guide to learn how to install Debian GNU/Hurd
  Install Debian GNU/Hurd in real hardware
- If you prefer to use a virtual machine such as QEMU, you can use this script: qemu_hurd script.
  - You can also install it in VirtualBox!! ;)
Compile the sources: From Debian GNU/Hurd, follow these steps:
1. Clone the repository:
  
  git clone https://github.com/AlmuHS/GNUMach_SMP
2. Install the dependencies
```
apt-get install build-essential fakeroot
apt-get build-dep gnumach
apt-get install mig
```
3. Configure preliminary steps
  
  cd GNUMach_SMP
  autoreconf --install
  
  create build directory
  
  mkdir build
  cd build
  
  ../configure --prefix=
4. Compile!!
  
  make gnumach.gz
  1. Copy the new image to /boot directory (as root)
    
    cp gnumach.gz /boot/gnumach-smp.gz
  2. Update grub (as root)
    
    update-grub
  3. Reboot
    
    reboot
  After rebooting, you must select gnumach-smp.gz in the GRUB menu

More info in: https://www.gnu.org/software/hurd/microkernel/mach/gnumach/building.html

Task done

Recovered and updated old APIC headers from Mach 4
Modified configfrag.ac
- Now, if mach_ncpus > 1, NCPUS will be set to 255
Integrated cpu detection and enumeration from acpi tables
Solved memory mapping for *lapic. Now it's possible to read the Local APIC of the current processor.
Implemented cpu_number() function
Solved ioapic enumeration: changed linked list to array
Initialized master_cpu variable to 0
Initialized ktss for master_cpu
Enabled cpus using StartUp IPI, and switched them to protected mode
- Loaded temporary GDT and IDT
Implemented assembly CPU_NUMBER()
Refactored cpu_number() with a more efficient implementation
Added interrupt stack to cpus
Improved memory reservation for the cpu stacks, using Mach style (similar to interrupt stack)
Enabled paging in AP processors
Loaded final GDT and IDT
Added cpus to scheduler

Current status

In the Min_SMP test environment, the cpus are detected and started correctly
- I need to implement APIC configuration
In gnumach, the number of cpus and their lapic structures are detected and enumerated correctly
ioapic enumeration seems to work correctly
- Mach uses the 8259 PIC controller, so ioapic is not necessary. Migrating Mach to ioapic is a future TODO
gnumach enables all cpus successfully during boot
The cpus are added successfully to the kernel
gnumach boots with 2 cpus
- It fails with more than 2 cpus, and with only one cpu. TODO: fix it
Some Hurd servers fail
- The DHCP client crashes during boot
- The login screen doesn't receive keyboard input

Implementation

Summary

CPU detection and enumeration are implemented in acpi_rsdp.c and acpi_rsdp.h.
- The main function, acpi_setup(), is called from model_dep.c
- This function generates some structures:
  - *lapic: pointer to the local apic of the current processor. It stores the registers of the local apic.
  - ncpu: variable which stores the number of cpus
- The apic_id is stored in machine_slot
The APIC structures, recovered from old gnumach code, are stored in apic.h
The cpu_number() C implementation was added to kern/cpu_number.c.
The CPU_NUMBER() assembly implementation was added to i386/i386/cpu_number.h
The start_other_cpus() function was modified to use the ncpu variable instead of the NCPUS macro
The memory mapping is implemented in vm_map_physical.c and vm_map_physical.h
- The lapic mapping is in extra_setup()
- This call requires paging to be configured, so the call is added in kern/startup.c, after the paging configuration
Enabling the cpus is implemented in mp_desc.c
- The routine to switch the cpus to protected mode is cpuboot.S
cpu_number() has been refactored, replacing the while loop with the array apic2kernel[], indexed by apic_id
CPU_NUMBER() assembly function has been implemented using apic2kernel[] array
Added call to interrupt_stack_alloc() before mp_desc_init()
Added paging configuration in cpuboot.S
Added calls to gdt_init() and idt_init() before call to slave_main(), to load final GDT and IDT.
Enabled call to slave_main(), to add AP processors to the kernel
Moved paging configuration to paging_setup() function
Solved a small problem with the AP stacks: now each AP has its own stack

Recover old gnumach APIC headers

We have recovered the apic.h header, originally from Mach 4, with the Local APIC and IOAPIC structs, and an old implementation of cpu_number().

The cpu_number() C implementation was moved to kern/cpu_number.c, and the assembly CPU_NUMBER() implementation was moved to i386/i386/cpu_number.h
struct ApicLocalUnit was updated with the latest Local APIC fields, and stored in imps/apic.h

CPU detection and enumeration

In this step, we find the Local APIC and IOAPIC registers in the ACPI tables, and enumerate them.

The implementation of this step is based on the Min_SMP acpi.c implementation. The main function is acpi_setup(), which calls other functions to walk the ACPI tables.

To adapt the code to gnumach, some changes were necessary:

Copy and rename files

The acpi.c and acpi.h files were renamed to acpi_rsdp.c and acpi_rsdp.h

These files were copied into the i386/i386at/.. directory

Change headers and move variables

The #include headers must be changed to the gnumach equivalent. Some variables declared in cpu.c were moved to acpi_rsdp.c or other files:
- The number of cpus, ncpu, was moved to acpi_rsdp.c
- The lapic ID, stored in cpus[] array, was added to machine_slot[NCPUS] array, and the cpus[] array was removed.
- The lapic pointer extern declaration was added to kern/machine.h
- The struct list ioapics was changed to ioapics[16] array, in acpi_rsdp.c
- struct ioapic was moved to imps/apic.h

Replace physical address with logical address

The most important modification is to replace each physical address with the equivalent logical address. To ease this task, this function is called before paging is configured.

Memory addresses below 0xc0000000 are mapped directly by the kernel, and their logical addresses can be obtained with the macro phystokv(address). This method is used to get the logical addresses of the ACPI table pointers.

But the lapic pointer is located at a high memory address, above 0xf0000000, so it must be mapped manually. To map this address we need paging, which is not configured yet. To solve this, we split the process into two steps:
- In the APIC enumeration step, we store the lapic address in a temporary variable: lapic_addr
- After paging is configured, we call the function extra_setup(), which reserves the memory address for the lapic pointer and initializes the real pointer, *lapic.
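The two-step handling of the lapic address can be sketched as follows. This is a user-space simulation under stated assumptions, not the gnumach code: every name prefixed with sketch_ is hypothetical, the offset 0xC0000000 is assumed for the direct mapping, and the mapping routine simply returns a fake page instead of creating a real page mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offset of the direct mapping; the real value comes from the gnumach headers. */
#define SKETCH_KERNEL_OFFSET 0xC0000000u

/* Simplified stand-in for phystokv(): only meaningful for directly mapped low memory. */
static uint32_t sketch_phystokv(uint32_t phys)
{
    return phys + SKETCH_KERNEL_OFFSET;
}

static uint32_t sketch_lapic_addr;        /* step 1: physical address from the ACPI tables */
static volatile uint32_t *sketch_lapic;   /* step 2: usable pointer, valid after mapping */
static int sketch_paging_ready;           /* set once paging is configured */

/* Fake "mapped" Local APIC page, so that the sketch runs in user space. */
static volatile uint32_t sketch_fake_lapic_page[1024];

/* Hypothetical mapping routine: a real kernel would create a page mapping here. */
static volatile uint32_t *sketch_map_physical(uint32_t phys)
{
    (void)phys;
    return sketch_fake_lapic_page;
}

/* Step 1: during APIC enumeration, only remember the physical address. */
static void sketch_apic_enumerate(uint32_t lapic_phys)
{
    sketch_lapic_addr = lapic_phys;
}

/* Step 2: after paging is configured, map the address and set the real pointer. */
static int sketch_extra_setup(void)
{
    if (!sketch_paging_ready || sketch_lapic_addr == 0)
        return -1;
    sketch_lapic = sketch_map_physical(sketch_lapic_addr);
    return sketch_lapic != NULL ? 0 : -1;
}
```

Calling sketch_extra_setup() before sketch_paging_ready is set fails on purpose, which mirrors the ordering constraint: the address is recorded early, but the pointer only becomes usable once paging exists.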

Implementation of `cpu_number()` function

Once we have the lapic pointer, we can use it to access the Local APIC of the current processor. Using this, we have implemented the cpu_number() function, which searches the machine_slot[] array for the apic_id of the current processor, and returns the index as the kernel ID.

A newer implementation gets the kernel ID from the apic2kernel[] array, using the apic_id as index.

This function will be used later to find out which cpu is currently running.
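The difference between the two cpu_number() variants can be sketched as below. This is only an illustration under assumptions: the sketch_ names, the array sizes and the registration helper are hypothetical, and the APIC ID of the current processor is passed as a parameter, whereas the kernel reads it from the Local APIC through the lapic pointer.

```c
#include <assert.h>

#define SKETCH_NCPUS       4    /* assumed number of kernel cpu slots */
#define SKETCH_MAX_APIC_ID 256  /* assumed range of APIC IDs */

struct sketch_machine_slot { int apic_id; };

static struct sketch_machine_slot sketch_machine_slot[SKETCH_NCPUS];
static int sketch_apic2kernel[SKETCH_MAX_APIC_ID];
static int sketch_ncpu;

/* Record a detected cpu: the next free kernel ID is bound to its APIC ID. */
static int sketch_register_cpu(int apic_id)
{
    assert(apic_id >= 0 && apic_id < SKETCH_MAX_APIC_ID);
    if (sketch_ncpu >= SKETCH_NCPUS)
        return -1;
    sketch_machine_slot[sketch_ncpu].apic_id = apic_id;
    sketch_apic2kernel[apic_id] = sketch_ncpu;
    return sketch_ncpu++;
}

/* First version: linear search of machine_slot[] for the APIC ID. */
static int sketch_cpu_number_search(int current_apic_id)
{
    int i = 0;
    while (i < sketch_ncpu && sketch_machine_slot[i].apic_id != current_apic_id)
        i++;
    return i < sketch_ncpu ? i : -1;
}

/* Newer version: constant-time lookup indexed by APIC ID.
 * Only meaningful for APIC IDs that were registered beforehand. */
static int sketch_cpu_number_lookup(int current_apic_id)
{
    assert(current_apic_id >= 0 && current_apic_id < SKETCH_MAX_APIC_ID);
    return sketch_apic2kernel[current_apic_id];
}
```

The lookup version trades a small table for the loop, which matters because cpu_number() is called very often and, in its assembly form, has to stay short.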

CPU enabling using StartUp IPI

In this step, we enable the cpus using the StartUp IPI. To do this, we need to write the ICR register in the Local APIC of the processor which raises the IPI (in this case, the BSP raises the IPI to each processor).

To implement this step, we took inspiration from the Min_SMP mp.c and cpu.c files, and built on the existing work in i386/i386/mp_desc.c
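As a rough illustration of what writing the ICR involves, the sketch below builds the register values for an INIT IPI and a Startup IPI. The register offsets and field encodings follow the Intel Local APIC documentation; the sketch_ function names and the simulated register array are assumptions made for this example, and the delays and retries of a real startup sequence are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Local APIC register offsets, in bytes. */
#define LAPIC_ICR_LOW   0x300u
#define LAPIC_ICR_HIGH  0x310u

/* Fields of the low ICR word. */
#define ICR_DELIVERY_INIT     (5u << 8)
#define ICR_DELIVERY_STARTUP  (6u << 8)
#define ICR_LEVEL_ASSERT      (1u << 14)

/* Simulated register file standing in for the memory-mapped Local APIC. */
static uint32_t sketch_lapic_regs[0x400 / 4];

static void sketch_lapic_write(uint32_t offset, uint32_t value)
{
    assert(offset % 4 == 0 && offset < sizeof(sketch_lapic_regs));
    sketch_lapic_regs[offset / 4] = value;
}

/* Writing the low word is what sends the IPI, so the destination is written first. */
static void sketch_send_ipi(uint8_t dest_apic_id, uint32_t icr_low)
{
    sketch_lapic_write(LAPIC_ICR_HIGH, (uint32_t)dest_apic_id << 24);
    sketch_lapic_write(LAPIC_ICR_LOW, icr_low);
}

/* INIT IPI: resets the target processor before the Startup IPI. */
static void sketch_send_init_ipi(uint8_t dest_apic_id)
{
    sketch_send_ipi(dest_apic_id, ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT);
}

/* Startup IPI: the vector field is the 4 KiB page number of the AP entry code. */
static void sketch_send_startup_ipi(uint8_t dest_apic_id, uint32_t entry_phys)
{
    assert((entry_phys & 0xFFFu) == 0 && entry_phys < 0x100000u);
    sketch_send_ipi(dest_apic_id,
                    ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | (entry_phys >> 12));
}
```

The vector encoding is the reason the AP entry routine has to sit on a page boundary below 1 MiB: the target processor starts executing in real mode at the address given by vector << 12.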

We have split this task in some steps:

Modify start_other_cpus()

The start_other_cpus() function calls cpu_start(cpu) for each cpu, to enable it. We have modified this function to use the ncpu variable instead of the NCPUS macro, reserve memory for the cpu stacks, and initialize machine_slot[] to indicate that the cpu is disabled.

Furthermore, we have added some printf calls to show the number of cpus and the kernel ID of the current cpu.
- Reserve memory for cpu stack
  
  To implement this step, we took the interrupt stack code as a base, using the function interrupt_stack_alloc().

We have added two new arrays to store the stack pointers of each cpu.
- cpu_stack[] stores the pointer to the stack
- _cpu_stack_top[] stores the address of the stack top

All stacks use a single memory reservation: we only reserve a single memory block, which is split among the cpu stacks. To reserve the memory, we call init_alloc_aligned(), which reserves memory from the BIOS area. This function returns the initial address of the memory block, which is stored in stack_start.

All stacks have the same size, which is stored in the STACK_SIZE macro.

Once the memory is reserved, we assign the slices to each cpu using stack_start as the base address. In each step, we assign stack_start to cpu_stack[cpu] and stack_start+STACK_SIZE to _cpu_stack_top[cpu], and then increase stack_start by STACK_SIZE.

To ease loading the stack in each cpu, we have added a single stack pointer, called stack_ptr. Before enabling each cpu, this pointer is updated to the cpu_stack of that cpu. This pointer is used by the cpuboot.S assembly routine to load the stack in that cpu.
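The way the single memory block is split into per-cpu stacks can be sketched as follows. The names and the stack size are assumptions for the example (the real size is given by the STACK_SIZE macro), and the block address is passed in as a plain number instead of coming from init_alloc_aligned().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NCPUS      4
#define SKETCH_STACK_SIZE 4096u  /* assumed; the real value is the STACK_SIZE macro */

static uintptr_t sketch_cpu_stack[SKETCH_NCPUS];      /* start of each cpu stack */
static uintptr_t sketch_cpu_stack_top[SKETCH_NCPUS];  /* end of each cpu stack */

/* Split one contiguous block starting at stack_start into ncpu equal stacks. */
static void sketch_assign_cpu_stacks(uintptr_t stack_start, int ncpu)
{
    assert(ncpu > 0 && ncpu <= SKETCH_NCPUS);
    for (int cpu = 0; cpu < ncpu; cpu++) {
        sketch_cpu_stack[cpu] = stack_start;
        sketch_cpu_stack_top[cpu] = stack_start + SKETCH_STACK_SIZE;
        stack_start += SKETCH_STACK_SIZE;  /* advance to the next slice */
    }
}
```

With this layout the top of one stack is the start of the next, so a single reservation of ncpu * STACK_SIZE bytes covers every cpu.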

Complete intel_startCPU()

The purpose of intel_startCPU() is to enable the cpu given as parameter, calling startup_cpu() to raise the Startup IPI, and to check whether the cpu has been enabled correctly.

To write this function, we based it on XNU's intel_startCPU() function, replacing its calls with the gnumach equivalents and removing unneeded code blocks.

Raise Startup IPI and initialize cpu

gnumach doesn't include any function to raise the Startup IPI, so we have implemented these functions, based on the Min_SMP cpu.c and mp.c functions:
- startup_cpu(): This function is called by intel_startCPU() to start the Startup IPI sequence in the cpu.
- send_ipi(): function to write the IPI fields in the ICR register of the current cpu
- cpu_ap_main(): The first function executed by the new cpu after startup. It calls cpu_setup() and checks for errors.
- cpu_setup(): Initializes the machine_slot fields of the cpu

These functions have been added to i386/i386/mp_desc.c

Implement assembly routine to switch the cpu to protected mode

After the Startup IPI is raised, the cpu starts in real mode, so we need to add a routine to switch the cpu to protected mode. Because real mode is 16-bit, we can't use C code (which is compiled for 32-bit mode), so this routine must be written in assembly.

This routine loads the GDT and IDT registers in the cpu, and calls cpu_ap_main() to initialize the machine_slot of the cpu.

To write the routine, we have taken the Min_SMP boot.S as a base, with a few modifications:
- The GDT descriptors are replaced with the gnumach GDT descriptors (boot_gdt: and boot_gdt_descr:), taken from boothdr.S. We also copied the register initialization after the GDT loading
- The _start routine is unnecessary and has been removed
- The physical addresses have been replaced with their equivalent logical addresses, using the same shift used in boothdr.S
- We have removed the hlt instruction after the call to cpu_ap_main

The final code is stored in i386/i386/cpuboot.S

Add interrupt stack to cpus

To allow cpus to execute interrupt handlers, an interrupt stack is needed. Each cpu has its own interrupt stack.

To get this, we've added a call to interrupt_stack_alloc() to initialize the cpus' interrupt stack array before the call to mp_desc_init().

This step doesn't show any new effect yet.

Enable paging in the cpus (WIP)

Before adding the cpus to the kernel, we need to configure paging in them, to allow full access to the memory.

To enable paging, we need to initialize the CR0, CR3 and CR4 registers, in a similar way to this.

This code has been copied into the paging_setup() function, in mp_desc.c. At startup, the processor isn't able to read the content of a pointer, so we copied the memory addresses of kernel_page_dir and pdpbase into two temporary integer variables: kernel_page_dir_addr and pdpbase_addr.

The paging initialization also requires a temporary mapping at some low memory address. We keep the temporary mapping made by the BSP until all APs are enabled.
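The register values involved in enabling paging can be sketched as below. This is a simplified model under assumptions, not the paging_setup() code: the function computes the new register values from the old ones instead of executing privileged instructions, the sketch_ names are hypothetical, and setting the write-protect bit together with the paging bit is an assumption of this example. The bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_WP  (1u << 16)  /* write protection in supervisor mode */
#define CR0_PG  (1u << 31)  /* paging enable */
#define CR4_PAE (1u << 5)   /* physical address extension */

struct sketch_paging_regs { uint32_t cr0, cr3, cr4; };

/* Compute the control register values for an AP from its current values.
 * CR3 (and CR4 for PAE) must be set before the paging bit in CR0. */
static struct sketch_paging_regs
sketch_paging_setup(struct sketch_paging_regs regs, int use_pae,
                    uint32_t kernel_page_dir_addr, uint32_t pdpbase_addr)
{
    if (use_pae) {
        assert((pdpbase_addr & 0x1Fu) == 0);           /* PDPT is 32-byte aligned */
        regs.cr3 = pdpbase_addr;
        regs.cr4 |= CR4_PAE;
    } else {
        assert((kernel_page_dir_addr & 0xFFFu) == 0);  /* page directory is 4 KiB aligned */
        regs.cr3 = kernel_page_dir_addr;
    }
    regs.cr0 |= CR0_PG | CR0_WP;
    return regs;
}
```

Passing the addresses as plain integers matches the text above: the AP receives kernel_page_dir_addr and pdpbase_addr as values, so it doesn't need to dereference a pointer before paging works.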

Add AP processors to the kernel

Once paging is enabled, each cpu is able to read its own Local APIC, using the *lapic pointer. This also allows it to execute the cpu_number() function, which is needed by slave_main() to add the cpu to the kernel.

Before calling slave_main(), we need to load the final GDT and IDT, to get the same values as the BSP and be able to load the LDT entries correctly.

To do this, we call gdt_init() and idt_init() in cpu_setup(), just before the call to slave_main().

Once the final GDT and IDT are loaded, slave_main() finishes successfully, and the APs are added to the kernel.

Gratitude

Bosco García: Development guidance, original MinSMP developer, explanations and documentation about multiprocessor architecture, help with gnumach development.
Guillermo Bernaldo de Quiros Maraver: Help with the original development, the first full compilation, and finding the original SMP problem (SMP without APIC support)
Samuel Thibault: Hurd core dev. Clarifications about SMP status in gnumach, help with gnumach questions.
Rodrigo V. G.: Help with debugging and memory addressing
Damien Zammit: Help with IOAPIC, I/O management and memory mapping
