buybackoff / 1brc

1BRC in .NET among fastest on Linux
https://hotforknowledge.com/2024/01/13/1brc-in-dotnet-among-fastest-on-linux-my-optimization-journey/
MIT License

IndexOfNewlineChar optimization? #2

Closed by richardtallent 8 months ago

richardtallent commented 8 months ago

Not entirely sure I'm following your code (I can't find where the powers of 10 are actually used, for example).

But something struck me about IndexOfNewlineChar: if only the \r\n and \n variants are supported (not the \r-only variant that is a relic of old Macs), you can search for \n alone and, if it is found at idx > 0, check whether it is preceded by \r and therefore needs a 2-character stride. Something like:

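internal static int IndexOfNewlineChar(ReadOnlySpan<byte> span, out int stride)
{
    stride = default;
    // Search only for '\n'; a lone '\r' is not treated as a line break.
    int idx = span.IndexOf((byte)'\n');
    if (idx < 0) return -1; // no newline found
    int lastIdx = idx - 1;
    // "\r\n": report the position of '\r' and a 2-byte stride.
    if (idx > 0 && span[lastIdx] == (byte)'\r') {
        stride = 2;
        return lastIdx;
    }
    stride = 1;
    return idx;
}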
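
For illustration only, here is a minimal sketch of how a caller might consume the returned index and stride to walk a buffer line by line. It is not taken from the repository: SplitLines and the sample input are hypothetical, and IndexOfNewlineChar is repeated so that the snippet compiles on its own as a top-level program (assuming .NET 6 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Search only for '\n', then check whether it is preceded by '\r'.
static int IndexOfNewlineChar(ReadOnlySpan<byte> span, out int stride)
{
    stride = 0;
    int idx = span.IndexOf((byte)'\n');
    if (idx < 0) return -1; // no newline found
    if (idx > 0 && span[idx - 1] == (byte)'\r')
    {
        stride = 2;     // "\r\n" terminator
        return idx - 1; // position of '\r'
    }
    stride = 1;         // bare "\n" terminator
    return idx;
}

// Hypothetical helper: splits a UTF-8 buffer into lines using the index/stride pair.
static List<string> SplitLines(ReadOnlySpan<byte> data)
{
    var result = new List<string>();
    while (!data.IsEmpty)
    {
        int idx = IndexOfNewlineChar(data, out int stride);
        if (idx < 0)
        {
            // Last line has no terminator.
            result.Add(Encoding.UTF8.GetString(data));
            break;
        }
        result.Add(Encoding.UTF8.GetString(data.Slice(0, idx)));
        data = data.Slice(idx + stride); // skip the line and its terminator
    }
    return result;
}

// Made-up sample mixing both supported terminators.
var lines = SplitLines(Encoding.UTF8.GetBytes("Oslo;1.5\r\nLima;20.1\nCairo;30.0"));
Console.WriteLine(string.Join(" | ", lines)); // prints "Oslo;1.5 | Lima;20.1 | Cairo;30.0"
```

Returning the stride alongside the index lets the caller advance past either terminator without re-inspecting the bytes.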
richardtallent commented 8 months ago

Hmm... looks like I'm a commit behind and none of this is relevant anymore :)

buybackoff commented 8 months ago

I did not fully understand the issue/question. Is that a copy of my code? I had that version earlier, and it does exactly what you describe. I have since found a better way.