UnicodeSequence and SingleEmoji now can handle null values in all compare operations

COM8 commented 5 years ago

Fixes #4 Fixes #5

Also:

Did the same described in #4 and #5 for UnicodeSequence
Let VS do a code format and import ordering run
Removed unused imports
~~Simplified types UInt32 => uint and UInt16 => ushort~~
SingleEmoji is now a struct
Added more unit tests for UnicodeSequence and SingleEmoji

Ready to be reviewed/merged

mqudsi commented 5 years ago

The other cleanup and and automated formatting is a great idea, though. Thanks.

COM8 commented 5 years ago

Actually, if you do not mind, the choice of UInt32 instead of uint and UInt16 instead of short was intentional as the types were chosen specifically for their size rather than for the data they will contain, i.e. we specifically want a 32-bit/4-byte storage rather than just simple unsigned integer semantics.

But uint and UInt32 as well as ushort and UInt16 are just aliases (Built-in types table). Or do I miss something here?

mqudsi commented 5 years ago

Yes, they're identical and typically the lowercase variants are the primitive type names that should be used. The typical exception is when specifically performing bitwise operations on instances of the type that specifically require a fixed number of bits, in which case being extra clear about the type that you are using is very important to avoid any human errors.

For example, if you need to get the most-significant byte of a short or UInt16, the code is the same. But in reviewing the code, compare how quickly the problem stands out:

byte DoSomething(...)
{
    short value = ...;
    return (value >> 24) & 0xFF;
}

vs

byte DoSomething(...)
{
    UInt16 value = ...;
    return (value >> 24) & 0xFF;
}

It would be the same reason to prefer using uint32_t or uint16_t if programming in C++ (quite apart from the fact that the same type can have a different size on different platforms in C++).

It's especially helpful for people that don't normally have to think about the actual size of a short or long. Or people that are used to both long and LONG being 32-bits on Windows if developing in unmanaged code.

COM8 commented 5 years ago

Ok I definitely see your point there. I changed it back (f697ac0). Thanks for your explanation!

mqudsi commented 5 years ago

Thanks, merged!

neosmart / unicode.net

UnicodeSequence and SingleEmoji now can handle null values in all compare operations #6