apache / mynewt-mcumgr

Apache mynewt
https://mynewt.apache.org/
99 stars 76 forks source link

fs_mgmt: Fix file download for Zephyr #65

Closed de-nordic closed 4 years ago

de-nordic commented 4 years ago

File download worked only for files small enough to fit into small mcumgr buffer, dedicated for CBOR encoding. Because file read buffer size is independent from the size of CBOR encoding buffer and part of CBOR buffer is used outside of control of download backend, it could happen that the chunk read could not fit into actual free space left within CBOR buffer, automatically failing the download.

Signed-off-by: Dominik Ermel dominik.ermel@nordicsemi.no

de-nordic commented 4 years ago

Zephyrproject-rtos/mcumgr fork PR review: https://github.com/zephyrproject-rtos/mcumgr/pull/14

Zephyr "umbrella" PR that updates west and allows to test changes on Zephyr: https://github.com/zephyrproject-rtos/zephyr/pull/23011

ccollins476ad commented 4 years ago

@de-nordic could you please explain the issue in a bit more detail? Would setting FS_MGMT_DL_CHUNK_SIZE to a suitable value solve the problem? Also, is this problem somehow Zephyr-specific?

Regarding the new tinycbor functions to measure the size of encoded data: the log management command does something similar without requiring a tinycbor update. It uses a "count encoder": https://github.com/apache/mynewt-mcumgr/blob/master/cmd/log_mgmt/src/log_mgmt.c#L197-L238. I didn't totally understand the issue in this PR, so I am not sure if that helps or not.

vrahane commented 4 years ago

@de-nordic I agree with what @ccollins476ad is saying. We do have a count encoder which can be used to pre-determine the size.

de-nordic commented 4 years ago

@de-nordic could you please explain the issue in a bit more detail? Would setting FS_MGMT_DL_CHUNK_SIZE to a suitable value solve the problem? Also, is this problem somehow Zephyr-specific?

Setting FS_MGMT_DL_CHUNK_SIZE to suitable size would also solve the problem, we would just waste some free bytes of the buffer. The buffer that is reserved for CBOR encoding, when it gets into fs_mgmt_file_download, is already shortened by TX header, which is smp specific, then 9 bytes of CBOR data that encodes beginning of the CBOR map (infinite length map); there needs to be 1 byte reserved for the termination of the map (at the end), which happens outside of fs_mgmt_file_download. Inside the function we encode some strings "off", "rc", "data", "len" which will have constant sizes, they are followed (each one separately) by data hey represent in case of "off" and "len" these have variable length: up 5 bytes per each , if they contain numbers that require 32 bit representation. "len" is only encoded within first packet. "data" pair is the one where we store data read and this would be FS_MGMT_DL_CHUNK_SIZE.

The problem has been detected on Zephyr. Basically what happens is that when you try to download file that is bigger than the free space (the one that is now calculated) the chunk read (of FS_MGMT_DL_CHUNK_SIZE) will not fit into what free space there is left in buffer.

Regarding the new tinycbor functions to measure the size of encoded data: the log management command does something similar without requiring a tinycbor update. It uses a "count encoder": https://github.com/apache/mynewt-mcumgr/blob/master/cmd/log_mgmt/src/log_mgmt.c#L197-L238. I didn't totally understand the issue in this PR, so I am not sure if that helps or not.

I did not know about the functions, I will try to use them instead. Thanks for information.

ccollins476ad commented 4 years ago

Others have already approved this and I don't want to be a nuisance, but I still don't understand why this fix is Zephyr-specific.

I also don't understand the need for this fix. Is this just to fit as much file data as possible into every response after the first? There is only a small difference between the first response and all subsequent responses: a three-character string ("len") and a variable-length integer containing the file size. So we're talking about six or so bytes. Is it that important to utilize those six bytes in all the subsequent responses?

de-nordic commented 4 years ago

Others have already approved this and I don't want to be a nuisance, but I still don't understand why this fix is Zephyr-specific.

I just did not have time to get myself informed how does cbor encoder on mynewt works and if there is the same problem.

I also don't understand the need for this fix. Is this just to fit as much file data as possible into every response after the first? There is only a small difference between the first response and all subsequent responses: a three-character string ("len") and a variable-length integer containing the file size. So we're talking about six or so bytes. Is it that important to utilize those six bytes in all the subsequent responses?

We do the full calculation once for every byte transferred, at offset 0. So it happens once. I loose some bytes anyway, because I reserve offset size enough to fit length and do not recalculate it afterwards; through the transfer of all chunks of file, the same, calculated at the beginning value is used.

The part that encodes data is also variable length, because it consists of pair ("data", chunk); while "data" is constant length, the chunk is pair itself (chunk_size, chunk_byte_string), where chunk_size is encoded by cbor as int and falls under the same rules as all other integer numbers encoded with cbor, and chunk_byte_string is just string of bytes

The calculation happens once per file transfer. I think this is better then calculating the value as preprocessor constant.

ccollins476ad commented 4 years ago

The calculation happens once per file transfer. I think this is better then calculating the value as preprocessor constant.

But this PR does use a constant (MCUMGR_BUF_SIZE). It's just a different constant.

Different connections may have different MTUs (e.g., in Bluetooth). Dynamically calculating the maximum payload size using the connection's MTU would certainly be an improvement, but that is not what this PR does.

The issues I have with this PR are:

  1. Creates a divergence in behavior among Zephyr and other OSes.
  2. We could achieve nearly the same effect with no code changes; just adjust FS_MGMT_DL_CHUNK_SIZE.

I don't think there is much else to say. Others should not wait for my approval before deciding whether to merge.

de-nordic commented 4 years ago

Different connections may have different MTUs (e.g., in Bluetooth). Dynamically calculating the maximum payload size using the connection's MTU would certainly be an improvement, but that is not what this PR does.

OK, well-founded criticism of the solution.

The issues I have with this PR are:

1. Creates a divergence in behavior among Zephyr and other OSes.

2. We could achieve nearly the same effect with no code changes; just adjust `FS_MGMT_DL_CHUNK_SIZE`.
  1. OK, I do not want to insist on doing that.

  2. I will try to work such solution then. Do you have any tips?

    I don't think there is much else to say. Others should not wait for my approval before deciding whether to merge.

I do not mind well-grounded criticism, I think others will understand the delay If we could work out better solution. I want it fixed. My way might not be the right way.

sjanc commented 4 years ago

should this be closed now?

de-nordic commented 4 years ago

should this be closed now?

Yes, this has been abandoned in favour of the less invasive approach https://github.com/apache/mynewt-mcumgr/pull/72