Closed mgravell closed 4 years ago
nm; I think I get it now - the Advance
is manual After the ReadAsync
. This is very unintuitive, but... I guess it works?
FYI, initial results - noting that SE.Redis does a lot more and is optimized for different usage to this:
ExecuteBedrock: time for 1000 ops: 76ms
ExecuteStackExchagneRedis: time for 1000 ops: 92ms
also: got it kinda sorta working in a BDN project, but now I'm seeing:
---> System.InvalidOperationException: End position was not reached during enumeration.
at System.ThrowHelper.ThrowInvalidOperationException_EndPositionNotReached()
at System.Buffers.SequenceReader`1.IsNextSlow(ReadOnlySpan`1 next, Boolean advancePast)
at System.Buffers.SequenceReader`1.TryReadTo(ReadOnlySequence`1& sequence, ReadOnlySpan`1 delimiter, Boolean advancePastDelimiter)
at System.Buffers.SequenceReaderExtensions.TryReadTo[T](SequenceReader`1& sequenceReader, ReadOnlySpan`1& span, ReadOnlySpan`1 delimiter, Boolean advancePastDelimiter) in C:\Code\BedrockRespProtocol\src\BedrockRespProtocol\Lifted\SequenceReaderExtensions.cs:line 7
I'm assuming that is the bug @Drawaes was telling me about?
Yes the very same bug.... Might need to work around it ....let me have a look tonight
On the API design: I think I'm in favor of the explicit Advance
. This ties in well with allowing me to do a zero-copy parse that just deframes segments of the original buffer. Then we can call Advance
once the deframed data has been processed. So: nothing to see here!
@Drawaes I worked around it by changing to the super-sophisticated:
if (sequenceReader.TryReadTo(out ReadOnlySequence<byte> payloadPlusPrefix, (byte)'\r')
&& sequenceReader.TryRead(out var n) && n == '\n')
which correctly locates my CR/LF pairs without falling in a mound.
Allocations are higher than the competing code, note - I think this is probably TPL overheads:
Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|
SERedis | 84.47 us | 1.214 us | 1.136 us | - | - | - | 360 B |
Bedrock | 77.38 us | 1.053 us | 0.985 us | - | - | - | 704 B |
initial allocations count based on 1000 operations:
Name Total (Allocations) Self (Allocations) Total Size (Bytes) Self Size (Bytes)
System.Runtime.CompilerServices.AsyncTaskMethodBuilder
1.AsyncStateMachineBox1
3009 601392 System.Threading.QueueUserWorkItemCallbackDefaultContext
1` 1002 32064 Bedrock.Framework.Protocols.ProtocolReader.<DoAsyncRead>d__13
1` 1001 136136 Bedrock.Framework.Protocols.ProtocolWriter.<WriteAsync>d__6
1` 1001 112112 Resp.RedisSimpleString.RedisSimpleStringMemory
1001 40040 BedrockRespProtocol.RespClientProtocol.<<ReadAsync>g__Awaited|7_0>d
1000 128000 BedrockRespProtocol.RespClientProtocol.<PingAsync>d__3
1000 112000 going to see what happens if I use PooledAwait
on that... will also see if I can get the .NET 5 bits working
with PooledAwait
on 2 of my methods, this becomes
Name Total (Allocations) Self (Allocations) Total Size (Bytes) Self Size (Bytes)
System.Runtime.CompilerServices.AsyncTaskMethodBuilder
1.AsyncStateMachineBox1
1011 217776 System.Threading.QueueUserWorkItemCallbackDefaultContext
1` 1003 32096 Bedrock.Framework.Protocols.ProtocolReader.<DoAsyncRead>d__13
1` 1001 136136 Bedrock.Framework.Protocols.ProtocolWriter.<WriteAsync>d__6
1` 1001 112112 Resp.RedisSimpleString.RedisSimpleStringMemory
1001 40040 The RedisSimpleStringMemory
is my actual objects - fine with those
The AsyncTaskMethodBuilder1.AsyncStateMachineBox
1 are from Name Bedrock.Framework.Protocols.ProtocolReader+<DoAsyncRead>d__13
1[System.__Canon]::MoveNext`
The QueueUserWorkItemCallbackDefaultContext
are from Bedrock.Framework.SocketReceiver+<>c::<.ctor>b__3_0 1002
most of the allocs should just evaporate if we can justify turning on the task amortizer bits; the ones that remain are the QueueUserWorkItemCallbackDefaultContext
from SocketAwaitable.Complete
- I can't see how I could bypass those since PipeScheduler.ThreadPool
is hard-coded in SocketConnection
; thoughts on that, @davidfowl ? does that concern you at all?
@mgravell found the same and created https://github.com/JanEggers/BedrockTransports/commit/4ed38b6e6692621ea6909411dba512725071a919
should i create a pr or are the more buildin solutions?
what is Task amortizer bits can you post a link i would like to try that with my mqtt bits.
The fix is these allocations is to do https://github.com/davidfowl/BedrockFramework/issues/19. I haven't done it yet, it's mostly copy pasta from Kestrel.
@JanEggers see https://github.com/dotnet/runtime/issues/13633
@JanEggers that custom scheduler looks pretty hairy re concurrency; I wonder if it would be better to simply test the state
arg to see if that is a IThreadPoolWorkItem
- and if so, UnsafeQueueUserWorkItem
it - otherwise push it to the thread-pool; or wait for David's voodoo
I guess voodoo is the best option
Closing this in favor of #19, which should solve most allocations except the state machine. That one will require more work see https://github.com/davidfowl/BedrockFramework/issues/20
@davidfowl the state machine evaporates if we https://github.com/dotnet/runtime/issues/13633
The
AdvanceTo
is only invoked when read fails; most protocols will involve multiple messages, but the second call toReadAsync
reliably fails with:Repro is here