victronenergy / venus

Victron Energy Unix/Linux OS
https://github.com/victronenergy/venus/wiki
588 stars 75 forks source link

mqtt-rpc: improve file transfers #345

Closed mpvader closed 4 years ago

mpvader commented 6 years ago

Before starting this; wait for #393 (lower MTU) is closed and released.

mqtt-rpc transmits that file in multiple chunks, and does not have a retry mechanism, and uses QoS 0 (which is no QoS, just fire once and then forget).

So, getting that file as a user is only possible if both links remains solid/up during the entire upload process:

This can be better:

Note that the file-transfer from Celery to Venus-device does have retries implemented. So its better, but when we start sending files in one message we can remove that code: even better.

mpvader commented 6 years ago

And then mqtt-rpc should use QoS 1 when publishing the file. Doing this, TCP level retries will be used as long as the connection is up, and mqtt retries will take over in case of a connection loss during the file transfer (which is just a single message transfer).

Both from Venus-Device to broker as broker to browser.

I don’t see a downside in using QoS 1 here.

On top of that, Celery and/or Browser can start a clean, though persistent, session when starting the task. Closing it once done.

Correct me if I’m wrong!

mauritsvdvijgh commented 6 years ago

1 large chunk is better, if there is no limit we run into I would prefer that.

To further improve the situation with bad connections we might add a parameter to firmwareupdate and veconfigurewrite to include the file/md5 in the firmwareupdate/veconfigurewrite command. Same goes for veconfigureread, make sure the command that instructs the venus device also uploads the file to mqtt. These actions consist all need 2 commands to complete them (file upload, firmware update. file upload veconfigurewrite. veconfigureread, filedownload)

Is there a way to cancel a message that is begin retried with QoS 1 (I don't think so, not sure)? If not that would be a possible downside. A suggestion for the future would be to use QoS 0 with celery job/task retries. With a view in VRM to manage the jobs/tasks and possibly collect results of the jobs/tasks (like the veconf file in the case of veconfigure download).

How would we manage a veconfigure download in VRM for an installation with a bad internet connection? Even if we use QoS one we still need to make sure to subscribe to the right path and receive the file in the VRM frontend to have the file be downloaded to the computer of the user.

If a user leaves the page while the message is still supposed to be delivered then the veconfigure download is still going to go through but will not be received on the computer of the user.

elnino-ict commented 5 years ago

This issue is on hold for now until MTU has been changed.

wiebeytec commented 5 years ago

Even though it's on-hold, I would like to comment on QoS 1 for the future: I do see a few issues. One was already mentioned by Maurits, and that is that you can't cancel it. Even if the installation of offline for a long time, the message will be received when it connects.

Another issue is that you can't specify a different setting for clean session for connects vs reconnects, and therefore you need to use cleansession: false. This will have side effects.

I don't recall the full specifics, but I think caution is indicated.

Perhaps MQTTv5 will help though.

mauritsvdvijgh commented 5 years ago

Proposal: Make device-update, vebusremoteconfig read/vebusremoteconfig write take a base64 encoded file as input parameter

Meaning no more seperate file upload and then call device-update or vebusremoteconfig read and no more seperate file download and then call vebusremoteconfig write. This makes these actions atomic (as they are now 1 command instead of 2 commands in quick succession). This also does away with the chunked file upload.

why?

mauritsvdvijgh commented 5 years ago

How to implement this:

with single command firmware update and vebusremoteconfig write

  1. Change file download to a "one shot" command to take an additional "file" parameter containing the file as a base64 encoded string (backwards incompatible). Consequence of this is that we cannot skip the uploading of the file if it already is on the device (currently it returns "file exists" before the actual chunk uploading starts)
  2. Change file upload to return the complete file in one go (backwards incompatible)
  3. Add a command "file-exists" which takes an md5 hash and returns if the file is already on the device.
  4. Add to the commands that use a file: device-update and vebusremoteconfig write an optional parameter "file" that can be used to pass the file as a base64 encoded string. If the file parameter is not given these commands will proceed as they currently do by looking up the file on the filesystem by the md5 hash.

Here everything can be done in 1 command except vebusremoteconfig read which would still require calling file download after it finished (but the file download command itself would be 1 request/1 response with the complete file). The option to check if a file exists can be used to intelligently skip file uploads if needed (not needed for local use cases, helpful for online/remote use cases where might deal with a bad internet connection).

Pros:

Cons:

without single command firmware update and vebusremoteconfig write

  1. Change file download to a "one shot" command to take an additional "file" parameter containing the file as a base64 encoded string (backwards incompatible). Consequence of this is that we cannot skip the uploading of the file if it already is on the device (currently it returns "file exists" before the actual chunk uploading starts)
  2. Change file upload to return the complete file in one go (backwards incompatible)
  3. Add a command "file-exists" which takes an md5 hash and returns if the file is already on the device.

Pros:

Cons:

mauritsvdvijgh commented 5 years ago

https://github.com/victronenergy/mqtt-rpc/commit/90f85b2f0118e8564167e3db3c4614b297e470cb This has been implemented. Closed it, then I remembered that you close issues when the relevant changes are included in Venus OS.

mpvader commented 5 years ago

@thlassche lets also put out an egg on this one this week.

mpvader commented 5 years ago

Izak: please make sure that its in; and then close it. thanks

izak commented 4 years ago

The one-chunk file upload was in mqtt-rpc v1.28, which was released in Venus 2.30 already. Closing this.