Add my 'Safe' (No Pointer) solution

NimaAra commented 8 months ago

I have implemented my solution with a self-imposed restriction: no unsafe code, managed only. It supports both \n and \r\n line endings.
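As a rough illustration of what tolerating both line endings involves (a minimal sketch in Python for brevity, not the C# code in this PR), one can split on \n and drop a trailing \r if present. The function name `parse_lines` is hypothetical, and the `station;temperature` record layout is assumed from the 1BRC challenge rather than stated in this thread.

```python
def parse_lines(buffer: bytes):
    """Yield (station, temperature) pairs from 'name;value' records.

    Assumes each record ends in either b"\n" or b"\r\n"; the final
    record may have no terminator at all.
    """
    start = 0
    while start < len(buffer):
        end = buffer.find(b"\n", start)
        if end == -1:
            end = len(buffer)  # last record without a trailing newline
        line = buffer[start:end]
        if line.endswith(b"\r"):
            line = line[:-1]  # tolerate Windows-style \r\n
        start = end + 1
        if not line:
            continue  # skip empty lines
        name, _, value = line.partition(b";")
        yield name.decode("utf-8"), float(value.decode("ascii"))


if __name__ == "__main__":
    for station, temp in parse_lines(b"Oslo;1.5\r\nRome;20.0\n"):
        print(station, temp)
```

Checking for the \r only at the end of each record keeps the common \n-only path to a single search per line.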

buybackoff commented 8 months ago

Which commit do my results in your table correspond to?

Also, did you run your implementation on the extended 10K dataset?

If it can't handle that dataset it will be listed as such.

NimaAra commented 8 months ago

Your results are for 52e82ca.

Yes, my solution can process the extended 10k file. FYI, your solution, when run as JIT, throws an exception against the 10k file on Windows. The commit I used to generate the extended file is 32143b2.

The error is:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate1[[System.Byte, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
at System.MemoryExtensions.IndexOf[[System.Byte, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.ReadOnlySpan1<Byte>, Byte)
at _1brc.Utf8Span.IndexOf(UIntPtr, Byte)
at _1brc.App.ProcessChunk(_1brc.FixedDictionary2<_1brc.Utf8Span,_1brc.Summary>, _1brc.Utf8Span)
at _1brc.App.ProcessChunk(Int64, UInt32)
at _1brc.App.<Process>b__20_0(System.ValueTuple2<Int64,Int32>)
at System.Linq.Parallel.SelectQueryOperator2+SelectQueryOperatorResults[[System.ValueTuple2[[System.Int64, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Int32, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].GetElement(Int32)

buybackoff commented 8 months ago

Hm, I just copied the exact dataset from my bench Linux machine to my Windows workstation, and it works. If you generate the dataset on Windows, it will have \r\n line endings, which are not valid. Unfortunately, everyone optimizes for such primitive specs.

NimaAra commented 8 months ago

They have updated their script so that on Windows it delimits lines with \n instead of \r\n. The file I ran it on was indeed \n delimited.
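A quick way to double-check which delimiter a given measurements file actually uses is to count both kinds in a sample from the start of the file. This is a minimal sketch in Python, not part of either solution; the function name `count_line_endings` and the 1 MiB default sample size are arbitrary choices made for illustration.

```python
def count_line_endings(path, sample_bytes=1 << 20):
    """Count \r\n and bare \n terminators in the first sample_bytes of a file."""
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    crlf = sample.count(b"\r\n")
    lf_only = sample.count(b"\n") - crlf  # every \r\n also contains one \n
    return {"crlf": crlf, "lf": lf_only}


if __name__ == "__main__":
    import sys

    # Usage: python check_endings.py <path-to-file>
    print(count_line_endings(sys.argv[1]))
```

Reading only a sample keeps the check fast on multi-gigabyte files, at the cost of missing mixed endings later in the file.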

buybackoff commented 8 months ago

So we have a heisenbug or something?

Are you in Western Europe? Sharing 17GB there should be quite feasible, even fast.

NimaAra commented 8 months ago

Actually, you are right; ignore that. I was pointing it to the wrong file! :-)

NimaAra commented 8 months ago

Interesting: on my broken laptop, my JIT build (against the 10k file on Windows) runs in 23s vs yours in 27s. I can't try AOT on it for now, but I will try that on my workstation tomorrow.

It would be interesting if you could run my solution on your Linux box and compare numbers.

NimaAra commented 8 months ago

Okay, here's what I see on my Windows workstation (dual-socket Xeon X5650 | 24 cores). The CPU is very old, so no AVX/AVX2.

Yours (83dffa7)

Mode	Default File	10k File
JIT	5.1	12.5
AOT	4.8	12.3

Mine (2f17f85)

Mode	Default File	10k File
JIT	5.5	7.7
AOT	5.2	7.5

And on my laptop (i7-6600U | 4 cores), which supports AVX/AVX2:

Yours

Mode	Default File	10k File
JIT	12.7	28.2
AOT	13.8	31.6

Mine

Mode	Default File	10k File
JIT	17.5	23.5
AOT	16.6	23.4

These are times in seconds, averaged across 3 runs.

buybackoff commented 8 months ago

dotnet build -c Release

The CPU is very old so no AVX/AVX2.

😨 In Europe, such a CPU would cost more in electricity than the useful results it produces are worth. I'm trying to get rid of an i7-2600 Optiplex; the Windows Pro on it is more valuable than the rest of the machine.

The absence of AVX explains the relative results.

NimaAra commented 8 months ago

Yeah, I know it's old; I am building a new Ryzen 7950X machine but I'm not there yet.

What do you get with AOT?

buybackoff commented 8 months ago

Slightly better

buybackoff / 1brc

Add my 'Safe' (No Pointer) solution #11