Closed Freax13 closed 1 month ago
I rebased onto main to resolve some conflicts.
Taking a closer look, I don't understand why these changes are needed at all. The address held by
HV_DOORBELL_ADDR
simply indicates where in thePerCpu
structure the pointer to the #HV doorbell page is located. It is not a promise that the current CPU has a doorbell page. In fact it cannot, because in a multi-processor configuration, the APs start running in an environment in whichHV_DOORBELL_ADDR
is non-null but the local doorbell page has not yet been set up. For this reason, the #HV handling code looks not only atHV_DOORBELL_ADDR
to decide whether to process, but also at whether the pointer fetched from that address is non-null. Consequently, I see no fault with the existing code, and no reason to change it.
Previously HV_DOORBELL_ADDR
wasn't properly initialized, it was always 0
on all CPUs at all times. This was because setup_hv_doorbell
was called after idt_init
, so idt_init
couldn't properly setup HV_DOORBELL_ADDR
.
Previously
HV_DOORBELL_ADDR
wasn't properly initialized, it was always0
on all CPUs at all times. This was becausesetup_hv_doorbell
was called afteridt_init
, soidt_init
couldn't properly setupHV_DOORBELL_ADDR
.
I missed that before. In that case, a change is definitely needed. However, I think we still have a subtle issue with the sequence. Once the #HV doorbell page is registered with the hypervisor, it is possible to start receiving events, but if the #HV handler is not prepared to consume them, then they will remain pending indefinitely because NoFurtherSignal will remain set and the hypervisor will not send additional #HV notifications to force processing. The best way to handle this will to ensure that HV_DOORBELL_ADDR
is set before the #HV page is registered with the hypervisor. I think that just means moving the call to init_doorbell_addr()
so it comes before the call to this_cpu().configure_hv_doorbell()
.
Previously
HV_DOORBELL_ADDR
wasn't properly initialized, it was always0
on all CPUs at all times. This was becausesetup_hv_doorbell
was called afteridt_init
, soidt_init
couldn't properly setupHV_DOORBELL_ADDR
.I missed that before. In that case, a change is definitely needed. However, I think we still have a subtle issue with the sequence. Once the #HV doorbell page is registered with the hypervisor, it is possible to start receiving events, but if the #HV handler is not prepared to consume them, then they will remain pending indefinitely because NoFurtherSignal will remain set and the hypervisor will not send additional #HV notifications to force processing. The best way to handle this will to ensure that
HV_DOORBELL_ADDR
is set before the #HV page is registered with the hypervisor. I think that just means moving the call toinit_doorbell_addr()
so it comes before the call tothis_cpu().configure_hv_doorbell()
.
Yeah, that makes sense. I restored the previous call graph and changed PerCpu::hv_doorbell_addr
to return the address to the hv_doorbell
field even if it hasn't been initialized yet. For that to work, I had to change the OnceCell<&'static HVDoorbell>
to Cell<Option<&'static HVDoorbell>>
because OnceCell
doesn't have memory layout guarantees, and Cell
does.
This PR contains some fixes for the #HV handler.
There's still one remaining problem which is that the
IF
flag is never ever enabled (except in some tests) and so some parts of the #HV handler are effectively dead code. The reason for that is thatirqs_disable
remembers the state of theIF
when it's called, and if theIF
flag is not set,irqs_enable
will not setIF
. This doesn't seem like intended behavior, but I'm unsure where the best place to set theIF
flag would be.Cc @msft-jlange Could you please test these changes in your environment that has restricted injection support?