Are there any diagnostic CBOR tools that can analyse defective CBOR payloads?

mofosyne commented 7 years ago

Most CBOR tools simply throw an uninformative error message (e.g. cbor.me).

Is there a CBOR tool that can emit a diagnostic message that works out and points out the defective parts of a CBOR message?

The Ruby cbor diagnostic tool just crashes if it encounters an error.

cabo commented 7 years ago

CBOR.me just provides the error messages that cbor-diag generates. These could, indeed, be a bit more, er, diagnostic.

If you have examples for not-quite-CBOR data where cbor-diag was particularly unhelpful, those could help us make some progress in improving the error messages. (A related question would be how those defective CBOR data came to be — knowledge about that might also be useful in generating better error messages.)

Grüße, Carsten

On 31. Aug 2017, at 11:14, mofosyne notifications@github.com wrote:


mofosyne commented 7 years ago

A real-world example: I found that I had declared a 3-item array in which each element was declared as an array of 2 sub-items, but I had actually inserted 3 sub-items.

The way I solved it was by creating a naive version of the C CBOR encoder with a debug print that interprets each code at a "chunk level", like a diagnostic message, but does not care about extra items being inserted incorrectly. (You could make a modified version that notes this and adds a !!warn or !!error message, perhaps at the corresponding byte offset.)

Defective CBOR:

83030283821a3870143d01197788821a3870143e0219ff00821a387014400202

My modified C naive parser:

cbor->stream.pos 32                          
decode @ offset   0 (0x83): ARRAY( 3)  = [3: 
decode @ offset   1 (0x03): UINT( 3)   = 3   
decode @ offset   2 (0x02): UINT( 2)   = 2   
decode @ offset   3 (0x83): ARRAY( 3)  = [3: 
decode @ offset   4 (0x82): ARRAY( 2)  = [2: 
decode @ offset   5 (0x1A): UINT(26)   = X...
decode @ offset  10 (0x01): UINT( 1)   = 1   
decode @ offset  11 (0x19): UINT(25)   = X...    <-------- Note this extra item
decode @ offset  14 (0x82): ARRAY( 2)  = [2: 
decode @ offset  15 (0x1A): UINT(26)   = X...
decode @ offset  20 (0x02): UINT( 2)   = 2   
decode @ offset  21 (0x19): UINT(25)   = X...    <-------- Note this extra item
decode @ offset  24 (0x82): ARRAY( 2)  = [2: 
decode @ offset  25 (0x1A): UINT(26)   = X...
decode @ offset  30 (0x02): UINT( 2)   = 2   
<----- I think it errored out in here so didn't print

decode @ offset <offset> (<head byte>): <Major Type>(<add type>) = <Value>

A debug diagnostic probably would still display all the bytes so that it's easier to compare byte by byte instead of having to cross reference the offset.
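A minimal sketch of such a chunk-level dump, written from scratch rather than taken from the modified parser above. The names `cbor_head` and `cbor_naive_dump` are hypothetical. It decodes one head at a time, prints it in roughly the format shown above, and ignores whether the declared array and map counts are respected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decode the head at buf[pos].
 * Returns the number of bytes the head occupies, or 0 if the head is
 * truncated or uses a reserved additional-information value (28..30). */
static size_t cbor_head(const uint8_t *buf, size_t len, size_t pos,
                        unsigned *major, unsigned *info, uint64_t *arg)
{
    assert(buf != NULL);
    if (pos >= len) return 0;
    *major = buf[pos] >> 5;      /* top 3 bits: major type */
    *info = buf[pos] & 0x1f;     /* low 5 bits: additional information */
    *arg = *info;
    if (*info < 24) return 1;    /* argument is stored in the head byte */
    if (*info == 31) { *arg = 0; return 1; }  /* indefinite length / break */
    if (*info > 27) return 0;    /* 28..30 are reserved */
    size_t extra = (size_t)1 << (*info - 24); /* 1, 2, 4 or 8 bytes */
    if (len - pos - 1 < extra) return 0;
    *arg = 0;
    for (size_t i = 0; i < extra; i++)
        *arg = (*arg << 8) | buf[pos + 1 + i];
    return 1 + extra;
}

/* Print every head in the buffer without checking container counts.
 * String payloads are skipped so that their bytes are not misread as heads.
 * Returns the number of heads printed. */
static size_t cbor_naive_dump(const uint8_t *buf, size_t len, FILE *out)
{
    static const char *names[8] = { "UINT", "NINT", "BSTR", "TSTR",
                                    "ARRAY", "MAP", "TAG", "SIMPLE" };
    size_t pos = 0, count = 0;
    while (pos < len) {
        unsigned major, info;
        uint64_t arg;
        size_t n = cbor_head(buf, len, pos, &major, &info, &arg);
        if (n == 0) {
            fprintf(out, "!!error @ offset %3zu (0x%02X): bad or truncated head\n",
                    pos, buf[pos]);
            break;
        }
        fprintf(out, "decode @ offset %3zu (0x%02X): %s(%2u) = %llu\n",
                pos, buf[pos], names[major], info, (unsigned long long)arg);
        pos += n;
        count++;
        if ((major == 2 || major == 3) && info != 31) {
            if (arg > len - pos) {
                fprintf(out, "!!error: string payload runs past end of data\n");
                break;
            }
            pos += (size_t)arg;
        }
    }
    return count;
}
```

Calling `cbor_naive_dump(payload, 32, stdout)` on the 32-byte defective example should print one line per head, including the last 0x02 at offset 31. Printing the raw bytes of each chunk next to the decoded line would be a small extension of the same loop.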

cabo commented 7 years ago

Please try that example on cbor.me now (you can use

http://cbor.me/?bytes=83(03-02-83(82(1A.3870143D-01)-19.7788-82(1A.3870143E-02)-19.FF.00.82.1A.38.70.14.40.02.02

if you don't have it handy). I think this is a slightly better solution for data that is too long.

Unfortunately, better handling of data that is too short is much harder to implement (but then it is now a bit easier to manually just throw a blob of f7 at the end of your data and see what happens).
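The f7 trick can also be scripted before pasting data into a tool. The following is only a sketch, and `cbor_pad_undefined` is a hypothetical helper name. It copies the payload and appends a run of 0xf7 bytes, which is the encoding of the simple value `undefined`, so that missing items in a too-short message show up as `undefined` in the decoded output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy src into dst and append `pad` bytes of 0xf7
 * (CBOR simple value `undefined`).
 * Returns the total length written, or 0 if dst is too small. */
static size_t cbor_pad_undefined(uint8_t *dst, size_t cap,
                                 const uint8_t *src, size_t len, size_t pad)
{
    assert(dst != NULL && src != NULL);
    if (len > cap || pad > cap - len) return 0;
    memcpy(dst, src, len);
    memset(dst + len, 0xf7, pad);   /* filler items for the missing data */
    return len + pad;
}
```

This helps when whole data items are missing. If the data was cut off in the middle of a head or a string, the padding bytes are consumed as argument or payload bytes instead, so the result needs to be read with that in mind.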

mofosyne commented 7 years ago

Certainly a step in the right direction. I think it's a good default behaviour to assume first that the message has erroneous extra bytes.

Though I would add an extra parameter for the case where I know the message really is longer and I just want to see how the rest of it is interpreted. It may be nice to add a "naive diagnostic mode" that keeps parsing even if the result is nonsensical (like the example I gave above). This could be done by adding an override flag at each point where the parser checks whether enough array items were parsed and throws an error if there are too many.

Pros: This works well if the CBOR payload is correct at the "cbor item" level and only the structure is wrong (e.g. incorrect map/array count, missing break primitives).
Cons: This will not work well if there is corruption at the MajorType/AdditionalType level.

Thus such a feature would be targeted more at those who already have a working CBOR library and are just using it at the cbor item level (e.g. generating a payload sequentially to avoid having to create too many buffers).

So the cbor.me diagnostic button could have a drop-down (or tickbox) saying "strict" vs "naive" mode or something. But again, it might require more work. At least you are now able to show how much it was able to parse before conking out.
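As a rough illustration of the "strict" vs "naive" idea, and not a description of how cbor.me or cbor-diag work, the sketch below skips over complete data items and only differs between the two modes in what it does with surplus bytes after the first top-level item. All names (`diag_mode`, `skip_item`, `diag_count_items`) are hypothetical, and indefinite-length items are left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum diag_mode { DIAG_STRICT, DIAG_NAIVE };          /* hypothetical names */
enum { DIAG_ERR_MALFORMED = -1, DIAG_ERR_SURPLUS = -2 };

/* Skip one complete definite-length data item starting at pos.
 * Returns the offset just past it, or 0 if it is malformed or truncated. */
static size_t skip_item(const uint8_t *buf, size_t len, size_t pos)
{
    if (pos >= len) return 0;
    unsigned major = buf[pos] >> 5, info = buf[pos] & 0x1f;
    uint64_t arg = info, children = 0;
    size_t extra = 0;
    if (info >= 28) return 0;       /* reserved or indefinite: not handled */
    if (info >= 24) {
        extra = (size_t)1 << (info - 24);
        if (len - pos - 1 < extra) return 0;
        arg = 0;
        for (size_t i = 0; i < extra; i++)
            arg = (arg << 8) | buf[pos + 1 + i];
    }
    pos += 1 + extra;
    if (major == 2 || major == 3) { /* byte/text string: skip the payload */
        if (arg > len - pos) return 0;
        return pos + (size_t)arg;
    }
    if (major == 4) children = arg;                  /* array items */
    else if (major == 5) {                           /* map: key + value */
        if (arg > UINT64_MAX / 2) return 0;
        children = arg * 2;
    } else if (major == 6) children = 1;             /* tag content */
    for (uint64_t i = 0; i < children; i++) {
        pos = skip_item(buf, len, pos);
        if (pos == 0) return 0;
    }
    return pos;
}

/* Count top-level items. *first_extra receives the offset where the first
 * item ends. Strict mode reports surplus bytes as an error; naive mode
 * keeps decoding them as further items. */
static int diag_count_items(const uint8_t *buf, size_t len,
                            enum diag_mode mode, size_t *first_extra)
{
    assert(buf != NULL && first_extra != NULL);
    size_t pos = skip_item(buf, len, 0);
    int items = 1;
    if (pos == 0) return DIAG_ERR_MALFORMED;
    *first_extra = pos;
    if (pos < len && mode == DIAG_STRICT) return DIAG_ERR_SURPLUS;
    while (pos < len) {             /* naive mode: carry on regardless */
        pos = skip_item(buf, len, pos);
        if (pos == 0) return DIAG_ERR_MALFORMED;
        items++;
    }
    return items;
}
```

On the defective example from earlier in the thread, strict mode stops with the surplus starting at offset 21, while naive mode walks the rest and finds three further top-level items. As noted in the cons, neither mode helps once a head byte itself is corrupted.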

cbor / cbor.github.io

Are there any diagnostic CBOR tools that can analyse defective CBOR payloads? #36