hoglet67 closed this issue 2 years ago
Here's a trace of tube detection working (without switching Co Pros):
D0 - Phi2
D1 - nTUBE (the trigger)
D2 - RnW
D3 - D0
D6 - GPU code telltale
D7 - tube_io_handler() telltale
You are seeing the results of this fragment of code:
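.LE375
LDX #&01
STX LFEE0    ; write &01 to &FEE0: bit 7 clear, so flag bit 0 is cleared
LDA LFEE0    ; read the register back
EOR #&01     ; invert bit 0: it is now 1 only if the flag really was cleared
LDX #&81
STX LFEE0    ; write &81 to &FEE0: bit 7 set, so flag bit 0 is set
AND LFEE0    ; bit 0 survives only if the second read shows the flag set
ROR A        ; shift bit 0 into carry: set only if both checks passed
RTS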
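To make the effect of a lost write concrete, here is a minimal C sketch of the same sequence. It is only a model under stated assumptions: the register semantics (bit 7 of a written value selects set or clear for the flag bits in bits 0-5, and bit 6 always reads as set) are simplified, and `tube_model_t`, `status_write`, `status_read` and `detect_tube` are hypothetical names, not PiTubeDirect code. The starting value 0x0E is chosen so the reads match the 4E/4F values reported elsewhere in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of the host-side register at &FEE0.
   Assumption: bit 7 of a written value selects set (1) or clear (0) for the
   flag bits named in bits 0-5; bit 6 of a read is modelled as always set. */
typedef struct {
    uint8_t flags;       /* control flags, bits 0-5 */
    bool drop_writes;    /* when true, writes are silently lost */
} tube_model_t;

void status_write(tube_model_t *t, uint8_t value)
{
    if (t->drop_writes) {
        return;                          /* emulate the lost write */
    }
    uint8_t mask = value & 0x3F;
    if (value & 0x80) {
        t->flags |= mask;                /* set the selected flags */
    } else {
        t->flags &= (uint8_t)~mask;      /* clear the selected flags */
    }
}

uint8_t status_read(const tube_model_t *t)
{
    return (uint8_t)(0x40 | t->flags);
}

/* Mirrors the 6502 fragment; the return value corresponds to the carry flag. */
bool detect_tube(tube_model_t *t, bool lose_second_write)
{
    status_write(t, 0x01);                          /* LDX #&01 : STX LFEE0 */
    uint8_t a = (uint8_t)(status_read(t) ^ 0x01);   /* LDA LFEE0 : EOR #&01 */
    t->drop_writes = lose_second_write;
    status_write(t, 0x81);                          /* LDX #&81 : STX LFEE0 */
    t->drop_writes = false;
    a &= status_read(t);                            /* AND LFEE0 */
    return (a & 0x01) != 0;                         /* ROR A */
}
```

Starting from `flags = 0x0E`, `detect_tube(&t, false)` returns true and the register then reads 0x4F. With `lose_second_write` set, the register still reads 0x4E and detection fails.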
And here's the failure:
The second call to tube_io_handler() completes in an unusually short time.
It looks like the problem is that TUBE_ENABLE_BIT in tube_irq is somehow being cleared unexpectedly.
That's why tube_io_handler() completes in an unexpectedly short time.
Just need to work out why now....
This is the reason: https://github.com/hoglet67/PiTubeDirect/blob/hognose-dev/src/tube-client.c#L77
This code predates the addition of TUBE_ENABLE_BIT to the tube_irq flags (to support the Null Co Pro).
When changing Co Processors, the tube is initially (and wrongly) disabled. This happens as soon as BREAK is pressed.
It's only re-enabled at the end of wait_for_rst_release() and, because of the debounce delay, this is actually about 1ms after nRST has gone high. Maybe longer, I need to check....
This blocks writes to the tube registers.
Changing it to the following resolves the issue:
// Clear all old interrupts, and set tube_enable appropriately
if (copro_def->type == TYPE_HIDDEN) {
   tube_irq = 0;
} else {
   tube_irq = TUBE_ENABLE_BIT;
}
(but possibly we also have an issue when switching to the null co processor)
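As a rough illustration of why the old reset path loses a host write, the sketch below compares the two behaviours. It is not the project's code: the value of `TUBE_ENABLE_BIT`, the `TYPE_GENERIC` constant, the cut-down `copro_def_t` and the `host_write_status` helper are all assumptions made for the example, and the "before" path is assumed to simply zero `tube_irq`.

```c
#include <stdbool.h>
#include <stdint.h>

#define TUBE_ENABLE_BIT 0x08u             /* assumed value, illustration only */
enum { TYPE_GENERIC, TYPE_HIDDEN };       /* TYPE_GENERIC is hypothetical */

typedef struct { int type; } copro_def_t; /* cut-down stand-in for the real struct */

uint32_t tube_irq;
uint8_t status_flags;

/* Assumed old behaviour: clearing every bit also clears the enable bit. */
void reset_before_fix(const copro_def_t *copro_def)
{
    (void)copro_def;
    tube_irq = 0;
}

/* Behaviour after the change quoted above. */
void reset_after_fix(const copro_def_t *copro_def)
{
    if (copro_def->type == TYPE_HIDDEN) {
        tube_irq = 0;
    } else {
        tube_irq = TUBE_ENABLE_BIT;
    }
}

/* Hypothetical host write: dropped while the tube is disabled. */
bool host_write_status(uint8_t value)
{
    if ((tube_irq & TUBE_ENABLE_BIT) == 0) {
        return false;                     /* write blocked, i.e. lost */
    }
    status_flags = value;
    return true;
}
```

In this model a write that arrives straight after `reset_before_fix()` is dropped, whereas after `reset_after_fix()` it is accepted for any co processor that is not `TYPE_HIDDEN`.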
Working reliably now.... ;-)
Just for the record, my Master booted to MODE 134 which slowed it down enough not to have this problem.
We are seeing this with the latest hognose dev: 110708dd
Several things need to be in place to trigger the bug:
Initially I thought it was the same as #141, where the host write to &FEE0 was delayed.
However, the below ICE trace indicates it is different.
This is what you see with the ICE:
The second read of &FEE0 should see 4F, so the write appears to be delayed.
However, a manual read of FEE0 in a breakpoint much later still sees 4E, so the write was actually lost, not just delayed. This makes it different to #141.
What might cause a lost tube message?
If the tube code running on GPU core 1 is somehow blocked by the firmware blob running on GPU core 0. But then I would expect the read of &FEE0 to return garbage.
If the doorbell message is overwritten by a subsequent message before it has time to be read. I think that might be what's happening here. That could happen if the ARM FIQ interrupts were blocked. Or if the ARM FIQ handler was completely evicted from cache.
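The overwrite hypothesis can be sketched with a single-slot mailbox. This only illustrates the general hazard; it is not the actual doorbell layout used between the GPU and the ARM, and `doorbell_t`, `doorbell_post` and `doorbell_take` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-slot doorbell: one message word plus a pending flag. */
typedef struct {
    uint32_t msg;
    bool pending;
    unsigned lost;      /* counts messages overwritten before being read */
} doorbell_t;

/* Producer side: posting never blocks, so an unread message is overwritten. */
void doorbell_post(doorbell_t *d, uint32_t msg)
{
    if (d->pending) {
        d->lost++;
    }
    d->msg = msg;
    d->pending = true;
}

/* Consumer side (e.g. an interrupt handler): returns false if nothing is pending. */
bool doorbell_take(doorbell_t *d, uint32_t *out)
{
    if (!d->pending) {
        return false;
    }
    *out = d->msg;
    d->pending = false;
    return true;
}
```

If the consumer is held off long enough for two posts to happen back to back, only one message is ever taken and `lost` becomes 1, which matches the symptom of a message that is never seen rather than one that merely arrives late.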
Only the following tube messages are seen by the ARM
So in the above tube detection code, only the two writes are seen by the ARM, and somehow the second write is getting lost!
A few more things I tried:
One way to debug this is to add "telltales" to the GPU code and to the tube_io_handler() code. Then just look at the timings.
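A telltale in this sense is just a spare output pin raised on entry to a routine and dropped on exit, so a logic analyser shows how long the routine ran. The sketch below is a hosted stand-in, assuming a plain variable `debug_pins` in place of a real GPIO register and bit 7 for the handler telltale. The helper names and the stub handler body are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TELLTALE_HANDLER (1u << 7)   /* assumed pin assignment */

volatile uint32_t debug_pins;        /* stand-in for a GPIO output register */
unsigned telltale_edges;             /* edges a logic analyser would see */
bool seen_high_in_handler;

void telltale_set(uint32_t bit)
{
    debug_pins |= bit;
    telltale_edges++;
}

void telltale_clear(uint32_t bit)
{
    debug_pins &= ~bit;
    telltale_edges++;
}

/* Stub for the real handler body; records whether the pin is high while it runs. */
void handler_body_stub(void)
{
    seen_high_in_handler = (debug_pins & TELLTALE_HANDLER) != 0;
}

/* The pin is high for exactly the duration of the handler body. */
void handler_with_telltale(void)
{
    telltale_set(TELLTALE_HANDLER);
    handler_body_stub();
    telltale_clear(TELLTALE_HANDLER);
}
```

On real hardware the width of the resulting pulse is the handler's execution time, which is how an unusually short completion shows up in a trace.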