zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.98k stars 6.68k forks source link

stream_flash: stream_flash_erase_page allows to accidently erase stream #67407

Open de-nordic opened 10 months ago

de-nordic commented 10 months ago

Describe the bug According to stream_flash_erase_page description it is erasing page unless it has already been erased in given stream flash context. The function takes context which holds information like current write pointer for stream flash or last erased page and offset of page to erase: https://github.com/zephyrproject-rtos/zephyr/blob/8785438d313fd20310661cc82c8e2327d2891271/include/zephyr/storage/stream_flash.h#L118-L130 The fact that the function allows to select page offset to erase is a problem.

In the concept of stream flash it should be only allowed to append data to the stream, so obviously only the forward pages should erased. The logic withing stream_flash_erase_page only protects against erasing the previously erased page, by a call to stream_flash_erase_page; which means that the function can erase any other page within the stream with no problem, it is not even checked if the page contains any data or not.

Additional problem with the function is that it will refuse (it will do nothing, no error) to erase page that ctx points to be erased, but once you select other page for erase and repeat request to the previous one it will be erased with no problem. For example lets assume sequence:

stream_flash_buffered_write(ctx, ...)
stream_flash_erase_page(ctx, 0)

where stream_flash_buffered_write started writing to stream, the page 0 will be erased prior to write, so call to stream_flash_erase_page, as the page is marked as erased in ctx, but sequence:

stream_flash_erase_page(ctx, offset_of_any_other_page);
stream_flash_erase_page(ctx, 0);

has just destroyed data written by the stream_flash_buffered_write, yet data pointer still points to where previous write ended. This happens due to: https://github.com/zephyrproject-rtos/zephyr/blob/8785438d313fd20310661cc82c8e2327d2891271/subsys/storage/stream/stream_flash.c#L88-L100 where it is enough for previously erased page offset to differ from current, for erase to be performed.

There is a case for using the stream_flash_erase_page, although not in current form: when avoiding lengthy erase on write. It may happen that buffers need to be synced to device, in which case next page may be required to be erase, which takes time; knowing that user may request erase of the next page in most convenient moment. The function stream_flash_erase_page should be deprecated, and replaced, to serve above case, function that would take size of predicted write and would make sure that enough space is erased, by erasing if needed, for the next write of the predicted size. User should not be allowed to erase any arbitrary page within stream.

To Reproduce The problem with stream_flash_erase_page has been found by @katgiadla, while running stream flash tests and can be reproduced by invoking:

zephyr/scripts/twister -T zephyr/tests/subsys/storage/stream/stream_flash/ -p nrf52840dk_nrf52840 --device-testing --device-serial  /dev/ttyACM0 -v --inline-logs

The above tests requires nrf52840dk_nrf52840 but the problem is universal.

The test will fail at the line: https://github.com/zephyrproject-rtos/zephyr/blob/8785438d313fd20310661cc82c8e2327d2891271/tests/subsys/storage/stream/stream_flash/src/main.c#L429 with information that the page has not been erased as expected.

Expected behavior I do not know what is intended behavior of the stream_flash_erase_page, as I have described above that the page may or may not be erased, depending on whether other page has been erased previously.

Impact With limited use of the function outside of the stream flash API, the impact is not huge. With the rather undefined behaviour of the function it is just asking for weird bugs to happen.

Environment (please complete the following information):

henrikbrixandersen commented 10 months ago

@de-nordic Please assign a priority to this.

de-nordic commented 10 months ago

@de-nordic Please assign a priority to this.

low. Currently it is used in two places, but should be avoided and removed.

github-actions[bot] commented 8 months ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

github-actions[bot] commented 6 months ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

github-actions[bot] commented 4 months ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

github-actions[bot] commented 2 months ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.