AdaCore / bb-runtimes

Source repository for the GNAT Bare Metal BSPs

Tasks on second core of the zynq7000 seem to be unable to wake up #32

Closed BottCode closed 4 years ago

BottCode commented 4 years ago

I'm working on zynq7000 ravenscar-full runtime with the dual-core support and I think I found a bug. Not sure about this, maybe I'm wrong. Here you can get an application reproducing the issue.

As you can see in this spec, there're two tasks (in addition to the main): the first one is allocated to CPU 1, while the second one to CPU 2.

Task A, executing on CPU 1, works correctly: every 500ms it is scheduled on that CPU. Task B, executing on CPU 2, doesn't seem to behave the same way. It should be scheduled on CPU 2 every 5000ms, but the following seems to happen:

  1. B is scheduled correctly the first time;
  2. when delay until Next Period; is reached, B goes to sleep, which is correct;
  3. the idle task, i.e. the one that executes when no other task is in the ready queue, gets CPU 2, which is correct;
  4. after (roughly) 5 seconds, B's alarm should fire... but this does not seem to happen. As a consequence, the idle task holds CPU 2 forever and B is never awakened.
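
For readers without access to the linked spec, the setup described above can be sketched roughly as follows. This is a minimal illustration, not the actual reproducer: the package name Periodic_Tasks and the variable Next_Release are assumptions, and only the task names, CPU assignments and periods come from the description. The real application runs under the Ravenscar profile, which would be selected with a configuration pragma (not shown here).

```ada
--  Hypothetical sketch of the two periodic tasks; names other than
--  A and B are illustrative assumptions.
with System.Multiprocessors; use System.Multiprocessors;

package Periodic_Tasks is
   task A with CPU => 1;  --  released every 500 ms on the first core
   task B with CPU => 2;  --  released every 5000 ms on the second core
end Periodic_Tasks;

with Ada.Real_Time; use Ada.Real_Time;

package body Periodic_Tasks is

   task body A is
      Period       : constant Time_Span := Milliseconds (500);
      Next_Release : Time := Clock;
   begin
      loop
         --  The job's work would go here.
         Next_Release := Next_Release + Period;
         delay until Next_Release;
      end loop;
   end A;

   task body B is
      Period       : constant Time_Span := Milliseconds (5000);
      Next_Release : Time := Clock;
   begin
      loop
         --  The job's work would go here.
         Next_Release := Next_Release + Period;
         delay until Next_Release;  --  B never wakes up from this delay
      end loop;
   end B;

end Periodic_Tasks;
```

With this structure, each delay until relies on a timer alarm being delivered to the CPU the task is bound to, which is the part that appears to fail on CPU 2.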

How can you reproduce the issue?

Build the app

Clone this repository, then modify the ravenscar-full-zynq7000 runtime in order to support dual-core and install it. I've reproduced this issue on GNAT 2018-arm-elf and 2019-arm-elf. Then build the application with gprbuild demo.gpr (you can also use the Makefile).

Executing on the target

Unfortunately, there is no emulator supporting the dual-core version of the zynq7000, so you have to reproduce this issue on the real target. I'm using xsdb 2016.4, the debugger provided by the Xilinx SDK, but you may be able to use GDB instead.

  1. Run xsdb in the obj folder.
  2. Execute source cora_xsdb.ini; the application should start.

I've set a breakpoint on B's delay until.

You should notice that execution hits that breakpoint. However, if you resume execution (with the con command) before the next job release (i.e. within 5000ms), the breakpoint is never hit again. Conversely, if you resume execution after the next job release, you will hit the breakpoint again. For example, if you wait 11000ms before resuming, you will hit the breakpoint two more times (11000 / 5000 = 2 job releases, rounded down).

Do you observe the same behaviour?

P.S. main.s is generated in this way: arm-eabi-objdump -D main > main.s.

Fabien-Chouteau commented 4 years ago

Hello @BottCode,

Thank you for the report. Unfortunately, as mentioned in #30, we do not support multiprocessor on the zynq7000 run-times and we do not plan to support it in the near future, so we won't investigate further.

However, I can give you some tips if you want to look into this:

Regards,

BottCode commented 4 years ago

Hi @Fabien-Chouteau, thank you for the answer and for the tips.

Yes, it seems that no alarm handler is installed for CPU 2. Maybe the global timer interrupt is not even propagated to CPU 2. I'll check it.

Anyway, for my project I need to use CPU 2, so I think I will add multicore support to the zynq7000 runtime, at least to the point of having working scheduling on CPU 2 as well. If I run into trouble, I'll ask you for support, if you agree.

BottCode commented 4 years ago

Hi @Fabien-Chouteau, I understood what the problem was and I've just applied a simple (and temporary) working solution. It comes down to a board/CPU initialization and configuration problem.

In s-bbthre.adb there is the following procedure:

   procedure Initialize_Slave
     (Idle_Thread   : Thread_Id;
      Idle_Priority : Integer;
      Stack_Address : System.Address;
      Stack_Size    : System.Storage_Elements.Storage_Offset)
   is
      CPU_Id : constant System.Multiprocessors.CPU := Current_CPU;

   begin
      Initialize_Thread
        (Idle_Thread, Null_Address, Null_Address,
         Idle_Priority, CPU_Id,
         Stack_Address + Stack_Size, Stack_Address);

      Queues.Running_Thread_Table (CPU_Id) := Idle_Thread;
   end Initialize_Slave;

This procedure is executed by CPU 2 and, as you can see, it performs no HW initialization: it only initializes the idle task and inserts it in the running thread table. Proper HW initialization is performed by procedure Initialize (contained in the same file), but that is executed only by CPU 1. Among other things, that procedure sets some registers concerning the IRQ (ICCICR) and selects the CPUs to which interrupts should be sent. This implies that only CPU 1 is informed about the timer interrupt, and in fact __gnat_irq_handler is never called on CPU 2. I've checked this with XSDB by setting a breakpoint on it.

So, I modified (maybe too simplistically) procedure Initialize_Slave in this way:

   procedure Initialize_Slave
     (Idle_Thread   : Thread_Id;
      Idle_Priority : Integer;
      Stack_Address : System.Address;
      Stack_Size    : System.Storage_Elements.Storage_Offset)
   is
      CPU_Id : constant System.Multiprocessors.CPU := Current_CPU;

   begin
      Board_Support.Initialize_Board; -- contained in s-bbbosu.adb
      Initialize_Timers;              -- contained in s-bbthre.adb

      Initialize_Thread
        (Idle_Thread, Null_Address, Null_Address,
         Idle_Priority, CPU_Id,
         Stack_Address + Stack_Size, Stack_Address);

      Queues.Running_Thread_Table (CPU_Id) := Idle_Thread;
   end Initialize_Slave;

Now the timer interrupt is also forwarded to CPU 2, and task scheduling on CPU 2 seems to work correctly.

What do you think about that? As soon as I can:

  1. I'll try to apply a more "elegant" solution;
  2. if it works, I'll submit it through a PR.

Fabien-Chouteau commented 4 years ago

s-bbthre.adb is a cross-target package that should not be modified for this.

If the problem is a lack of hardware initialization on the second CPU, you should have a look at what is done for the zynqmp. There is an Initialize_CPU_Devices function in https://github.com/AdaCore/bb-runtimes/blob/a535756e974df7ee3b19d348b2dc1c877191e9a4/src/s-bbbosu__armv8a.adb#L290 that is exported to be called from the crt0 only on secondary CPUs: https://github.com/AdaCore/bb-runtimes/blob/a535756e974df7ee3b19d348b2dc1c877191e9a4/aarch64/zynqmp/start.S#L309
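
As a rough illustration of that pattern (not the actual bb-runtimes code), a per-CPU initialization routine can be kept in target-specific board-support code and exported under a C symbol so that the startup assembly can call it on secondary cores. In this sketch the package name Board_Init and the external symbol name are assumptions, and the body is deliberately left empty because the register setup depends on the target.

```ada
--  Hypothetical sketch: the package name and the external symbol are
--  illustrative assumptions, not taken from the runtime sources.
package Board_Init is

   procedure Initialize_CPU_Devices;
   --  Per-CPU hardware setup, intended to be called from the startup
   --  code (crt0) on each secondary CPU before it enters the scheduler.

   pragma Export
     (C, Initialize_CPU_Devices, "__gnat_initialize_cpu_devices");

end Board_Init;

package body Board_Init is

   procedure Initialize_CPU_Devices is
   begin
      --  Target-specific work would go here, for example enabling this
      --  core's interrupt controller CPU interface and its timer
      --  interrupt. Left empty in this sketch.
      null;
   end Initialize_CPU_Devices;

end Board_Init;
```

The point of the design is that the cross-target thread package stays untouched: the secondary core's startup path calls the exported symbol, and all hardware knowledge stays in the board-support layer.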