clearlinux / mixer-tools

Software update mixer and related tools
Apache License 2.0
27 stars 37 forks source link

bsdiff logging: distinguish between command timeout and error #759

Closed phmccarty closed 4 years ago

phmccarty commented 4 years ago

In the bsdiff logging output, it would be helpful for log messages to distinguish between (a) bsdiff commands that exit with an error (see exit code table below), and (b) bsdiff commands that time out after the fixed 8-minute window. At the moment, both of these conditions log the same error message (Failed to create delta ...).

Here is an easy reference to bsdiff exit codes:

Code Meaning
0 Success. A delta is created.
1 Created a FULLDL file instead of a delta. The file was (historically) used as a hint to swupd that a "fullfile download" was needed instead of a delta download. In modern times, we delete the file, since swupd knows that the absence of a delta provides the same hint :-)
> 1 Fatal error. At the moment, the only possible value is 255 (returning -1 in the program). Other error codes > 1 but < 255 are reserved for other types of fatal errors in the future.
reaganlo commented 4 years ago

@phmccarty By looking at the code to create deltas, I think there are few more issues here. I've tried to explain them below and provide a suggestion. Please let me know your thoughts: @otaviobp FYI

Current Behavior:   Errors Write to bsdiffLog Remove delta file Return error
1 bsdiff FULLDL No Yes Yes
2 bsdiff Timeout Yes Yes Yes
3 delta > full Yes Yes Yes
4 test bspatch Yes No Yes
5 calculate hash No Yes Yes
6 hash mismatch No Yes Yes

Notice that if there is an error during bspatch (Step 4), the function is returned with an error but the delta file is not deleted.

Also, since we are in the process of implementing logging, going forward there will not be a separate bsdiff log file. So the column "Write to bsdiff log" will become "Write to log" and all the values should be "Yes".

So unless I'm not aware of a specific reason why its being done like this today, I recommend the following:

Suggested Behavior:

  Errors Write to log Remove delta file Return error
1 bsdiff Error Yes Yes Yes
2 bsdiff Timeout Yes Yes Yes
3 delta > full Yes Yes Yes
4 test bspatch Yes Yes Yes
5 calculate hash Yes Yes Yes
6 hash mismatch Yes Yes Yes
phmccarty commented 4 years ago

@reaganlo Thanks for the detailed breakdown. I think the "No" entries in the "Current Behavior" table were oversights, so I also think the three actions should occur for each of these error conditions, as well as for the separate error condition that I identified in the issue description. I will clarify the issue description to make sure that bsdiff exit codes are documented.

reaganlo commented 4 years ago

@ashleshaAtrey @phmccarty Thanks for updating the error codes. So I have updated the Suggested Behavior to catch all bsdiff errors instead of just bsdiff FULLDL. We are planning to have annotations in the new log file:<timestamp>[<loglevel>][<app>] E.g.

[INF][MIXER] [ERR][MIXER] [ERR][BSDIFF] [ERR][DNF] I am assuming it is ok to have a common annotation for [BSDIFF] with the \ explaining what the error is instead of having different annotations like [BSDIFF-TIMEOUT], [BSDIFF], [BSDIFF-FULLDL] . Please let us know otherwise.
phmccarty commented 4 years ago

@reaganlo I am fine with using [BSDIFF], as long as the specific error types are easily greppable from the \<desc>.

reaganlo commented 4 years ago

@phmccarty Cool. Thanks.

reaganlo commented 4 years ago

Hi @phmccarty ,

@otaviobp pointed out that bspatch always returns an error. So maybe it was intentional that the deltafile was not removed if there was a bspatch error.

Here is an e.g. to create the error: https://gist.github.com/otaviobp/79a966bcc749316d4698f8ff5dab4baf

So either we

1) Revert this commit: Remove delta file in case bspatch test fails OR 2) Update the code to not have the bspatch test while creating deltas as it always returns an error.

What do you think?

phmccarty commented 4 years ago

@otaviobp pointed out that bspatch always returns an error. So maybe it was intentional that the deltafile was not removed if there was a bspatch error.

Here is an e.g. to create the error: https://gist.github.com/otaviobp/79a966bcc749316d4698f8ff5dab4baf

That test is not quite correct... Since most content in Clear is owned by root:root, you need to extract the content with sudo tar xf ... and then apply the delta with sudo bspatch .... So, if you repeat the steps by using sudo where necessary, bspatch will succeed.

reaganlo commented 4 years ago

@phmccarty Yes you are correct. Thanks for the clarification.

reaganlo commented 4 years ago

Fixed in v6.2.4