datalog: not resilient to updated DAPLink updates that still require some datalogging workarounds

jaustin commented 3 years ago

See https://github.com/microbit-foundation/DAPLink-microbit/issues/119

It would appear that DAPLink versions that are identical apart from their version number yield different behaviour when datalogging.

0255: works properly 0256: HTML header doesn't appear to get written

This means that future DAPLink updates would require corresponding CODAL updates, which is suboptimal.

Instead, we'd like to deliberately increment the I2C protocol version when relevant changes are made and have CODAL key off that rather than the DAPLink version.

I think the safer behaviour is to treat unknown newer versions like the most recent known version

eg if protocol version N is know and has workarounds X,Y,Z, but N+1 is 'unknown' assume X,YZ, should be applied. If only, say, Z is required on N we can address this with a CODAL update. This is based on the assumption that the workarounds are suboptimal but not actively wrong (IE a more complete/correct implementation of the spec will work with the workarounds). If this is wrong, we should discuss further.

microbit-carlos commented 2 years ago

@JohnVidler in the code that applies/removes workarounds based on the DAPLink version, could you make sure that the workarounds applied in the CODAL code for the latest known DAPLink version is still applied for future all DAPLink versions.

So, when there are DAPLink releases that can remove workaround we manually add that to the CODAL code once we know the details.

JohnVidler commented 2 years ago

We need to codify this as a set of workarounds to apply everywhere, and ensure that this is still an issue applicable to the latest releases.

finneyj commented 1 year ago

I've reviewed the various codepaths that related to workarounds / optimizations based on the version of the board we're running on. Here's a definitive list of the current decisions we make, and the data this is based on:

Workaround	Version Data Used	Code Reference
MICROBIT_USB_INTERFACE_ALWAYS_NOP	version.board == 0x9904	https://github.com/lancaster-university/codal-microbit-v2/blob/18b23e44dcb3c65b191592fa9fb5fd5b31a3fc7d/source/MicroBitPowerManager.cpp#L119-L120
MICROBIT_USB_INTERFACE_BUSY_FLAG_SUPPORTED	version.i2c == 0x02	https://github.com/lancaster-university/codal-microbit-v2/blob/a3492c4378be02a0071721dbc5a7164840be5a11/source/MicroBitPowerManager.cpp#L105-L106
MICROBIT_USB_FLASH_SINGLE_PAGE_ERASE_ONLY	ALWAYS ON	https://github.com/lancaster-university/codal-microbit-v2/blob/18b23e44dcb3c65b191592fa9fb5fd5b31a3fc7d/source/MicroBitUSBFlashManager.cpp#L218-L223
MICROBIT_USB_FLASH_USE_NULL_TRANSACTION	version.i2c < 0x02	https://github.com/lancaster-university/codal-microbit-v2/blob/18b23e44dcb3c65b191592fa9fb5fd5b31a3fc7d/source/MicroBitUSBFlashManager.cpp#L218-L223
MICROBIT_USB_FLASH_100MS_AFTER_ERASE	version.board == 0x9905/0x9906	https://github.com/lancaster-university/codal-microbit-v2/blob/18b23e44dcb3c65b191592fa9fb5fd5b31a3fc7d/source/MicroBitUSBFlashManager.cpp#L235-L237

So, it looks like we are:

a) tied to multiple different version values for the various necessary workarounds. b) largely testing for specific version values and applying workarounds.

I suspect we need some discussion to determine if keying all these workarounds off just the I2C version field is wise and complete...

e.g. Taking each workaround in turn:

MICROBIT_USB_INTERFACE_ALWAYS_NOP: This is a silicon workaround because KL27 based interface chips can't wake up from deep sleep on the I2C line without losing data. Hence this is tied to the hardware, not I2C protocolversion. Can we safely identify KL27 based boards from I2C version? Both historic and future?

MICROBIT_USB_FLASH_100MS_AFTER_ERASE: Similarly, NRF52 based interface chips need a little rest after erasing a page, as the time varies depending of flash wear that is not taken into account when the transaction responds. Nor is there an API to determine completion.

MICROBIT_USB_INTERFACE_BUSY_FLAG_SUPPORTED: Is an I2C protocol based decision. We should be able to update this to simply >= 0x02 here. Then add a workaround for specific future versions if anything breaks.

MICROBIT_USB_FLASH_SINGLE_PAGE_ERASE_ONLY: Is a bit of a NOP, as this is currently always enabled anyway. It has minimal performance overhead, and prevents very large erases from looking like an I2C timeout.

MICROBIT_USB_FLASH_USE_NULL_TRANSACTION: Fixed a specific bug in earlier DapLink firmware (now fixed) that didn't reset the output buffer/return values correctly after some transactions... Hence another transaction was always needed to clear the buffer back to a known state.

SO.... reflecting on all the above, apart from MICROBIT_USB_INTERFACE_BUSY_FLAG_SUPPORTED, I think these are items that are either:

properly related to board revision (as they are silicon workarounds). Here we currently WOULD need to update CODAL whenever we released a new board revision. HOWEVER, we also can't simply do a > version.i2c type test either, as we may well change interface chips based on availability etc.
historic bugs that we would not expect to resurface on later versions. These should be safe to leave as "point solution" workarounds

I can't see a path away from the board revision issue. So perhaps maintain the status quo here (which is also not a bad plan given the fearsome complexity of these workarounds!), and have a policy to update this code with every board rev... or segment board.rev into MCU device classes somehow to allow generalization.

Thoughts based on DapLink direction of travel @microbit-carlos @jaustin ?

lancaster-university / codal-microbit-v2

datalog: not resilient to updated DAPLink updates that still require some datalogging workarounds #144