csharpvitamins / CSharpVitamins.ShortGuid

A convenience wrapper for dealing with base64 encoded Guids
MIT License
103 stars 20 forks source link

Fix: Invalid ShortGuids Must not Return True in TryParse #4

Closed hf-kklein closed 3 years ago

hf-kklein commented 3 years ago

The TryParse method sometimes returns true although the parsed argument is no valid ShortGuid. I found this to be counterintuitive. This PR adds a test case that demonstrates the behaviour and a fix addressing the issue in the TryParse method.

davetransom commented 3 years ago

I like your find @hf-kklein and agree it does seem counter intuitive.

I'm curious how you came across it?

This did come up 4 years ago, before any of the TryParse functionality was added, but the reason was the person was fiddling with the last character (which ends up as unused padding): https://www.singular.co.nz/2007/12/shortguid-a-shorter-and-url-friendly-guid-in-c-sharp/#comment-3445381362

If you step through a base64 decoder, you'll see the maths for this. The v (47) is shifted left, yielding 188, and the r (32) and g (43) are shifted right, yielding 2 - giving the same byte (190) as the last byte. The remainder is used if the base64 string is longer.

To summarise, the test uses a valid 22 character base64 string bullshitmustnotbevalid 😆, which creates a valid byte array to create the Guid b265e96e-ad18-eb9a-2d9e-8b5b7af6a589.

However, when encoding this Guid, it gives the base64 string as bullshitmustnotbevaliQ where the trailing "d" becomes a "Q" and both produce the same Guid. It effectively means that there can be multiple encoded strings for the equivalent Guid.

I might ponder a bit more about this one though, because I think the issue stems from the original static Guid Decode(string value) method, which should maybe have this sanity test added into it, so it is picked up by all attempts at decoding (TryParse, TryDecode) and should throw an exception when decoding "fails" (this is where exceptions are thrown when invalid based64 is supplied, etc...)

It does seem like a small change, but would you consider this breaking?

hf-kklein commented 3 years ago

I'm curious how you came across it

I integrated ShortGuid into a library and tests in a project using this library started to fail. These tests expected kind of an Argument or FormatException when they passed the above test string to a method. I honestly wasn't aware of the underlying base64 representation and the side effects; so thanks for your explanations :)

Maybe we can leave the existing TryParse method unchanged but introduce a new TryParseExact method? I could open another PR for this.

davetransom commented 3 years ago

I'm not keen on adding TryParseExact. It adds to the surface area, and is likely what most people would intend/want anyway.

One approach might be to add an optional strict flag to the original Decode() and TryDecode(), and when true, performs a check after the Guid is loaded from the base64 blob to check the re-encoded output matches the input, Decode(bool strict = false).

Or forego the extra complexity, and just always make it use a strict decode and bump the major version, so it doesn't catch anyone else out.

davetransom commented 3 years ago

I think it should opt for the later, of always using a strict decode.

One approach might be to add an optional strict flag to the original Decode() and TryDecode(), and when true, performs a check after the Guid is loaded from the base64 blob to check the re-encoded output matches the input, Decode(bool strict = false).

Or forego the extra complexity, and just always make it use a strict decode and bump the major version, so it doesn't catch anyone else out.

@hf-kklein would you like to submit another PR for this? You're welcome to.

Likewise if you'd prefer me to do it, that's fine too 🙂

hf-kklein commented 3 years ago

would you like to submit another PR for this?

6

davetransom commented 3 years ago

Thanks @hf-kklein, the implementation looks good, though I did mean the latter option:

...forego the extra complexity, and just always make it use a strict decode and bump the major version, so it doesn't catch anyone else out.

Though on hindsight I wasn't clear enough in my response.

Do you prefer it this way, with the optional strict argument, over just making it strict always?

hf-kklein commented 3 years ago

Do you prefer it this way, with the optional strict argument, over just making it strict always?

I prefered the implicit always strict version as in my original PR. But the optional strict parameter is fine for me, too.

davetransom commented 3 years ago

Fixed in #6