Closed: landaire closed this issue 4 years ago.
Pointers `a` and `b` are `char*`. That means that `a + 4` is really `a + 4 * sizeof(char)` in terms of bytes.
Ah, and sizeof(char) = 2 since they're UTF-16, correct?
Yes.
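To make the arithmetic concrete, here is a minimal sketch (not the coreclr code; the class and variable names are made up, and it must be compiled with /unsafe) showing that `a + 4` on a `char*` advances the pointer by 4 * sizeof(char) = 8 bytes:

```csharp
using System;

class CharPointerSketch
{
    // Pointer arithmetic on char* is scaled by sizeof(char). System.Char is a
    // UTF-16 code unit, so sizeof(char) == 2 and a + 4 moves the pointer 8 bytes.
    static unsafe void Main()
    {
        string s = "hello world";
        fixed (char* a = s)
        {
            char* b = a + 4;                        // 4 chars further, not 4 bytes
            Console.WriteLine(sizeof(char));        // 2
            Console.WriteLine((byte*)b - (byte*)a); // 8, i.e. 4 * sizeof(char)
        }
    }
}
```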
That makes 100% sense then. Thanks @mikedn, I thought I must have been missing something when I read the code.
I was recently reviewing the String comparison implementation in both .NET Core and the .NET Framework reference source, and I found the following block: https://github.com/dotnet/coreclr/blob/367d9b1dd1b542f3d391d0b9111d5914602db2d9/src/mscorlib/src/System/String.Comparison.cs#L93-L121
The optimization itself makes sense, but what doesn't make sense to me is that each comparison appears to overlap the previous one. For example, the 64-bit comparison would result in something like this:
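(Asterisks mark the bytes touched by each 8-byte read, assuming the offsets are byte offsets:)

```
byte:    0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
step 1:  *  *  *  *  *  *  *  *
step 2:              *  *  *  *  *  *  *  *
step 3:                          *  *  *  *  *  *  *  *
```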
Where each asterisk step is the next "iteration" of the unrolled loop. The core block referenced here is:
https://github.com/dotnet/coreclr/blob/367d9b1dd1b542f3d391d0b9111d5914602db2d9/src/mscorlib/src/System/String.Comparison.cs#L106-L108
Note that each pointer offset is incremented by 4, not 8, even though the register size is 8 bytes on 64-bit platforms. Even the length check is only for 12, which means this code would read an additional 4 bytes of garbage, I would think.
The same principle applies to the 32-bit version of this optimization, where the pointers are incremented by 2 instead of 4.
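For reference, the 64-bit block is shaped roughly like the sketch below (a paraphrase, not a verbatim copy of the linked lines; the helper name and the char-at-a-time tail loop are only there to make it self-contained):

```csharp
// Paraphrased sketch of the unrolled comparison: three long-sized (8-byte)
// reads per iteration at offsets a, a + 4, and a + 8, guarded by length >= 12.
static unsafe bool EqualsCoreSketch(char* a, char* b, int length)
{
    while (length >= 12)
    {
        if (*(long*)a != *(long*)b) return false;
        if (*(long*)(a + 4) != *(long*)(b + 4)) return false;
        if (*(long*)(a + 8) != *(long*)(b + 8)) return false;
        a += 12; b += 12; length -= 12;
    }

    // Drain whatever remains one char at a time.
    while (length > 0)
    {
        if (*a != *b) return false;
        a++; b++; length--;
    }

    return true;
}
```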
Is this an accident, or is there something else at play here that I'm not aware of?