quic-go / quic-go

A QUIC implementation in pure Go
https://quic-go.net
MIT License
10.1k stars 1.31k forks source link

race condition in sendWindowSize #2013

Closed apmattil closed 5 years ago

apmattil commented 5 years ago
==================
WARNING: DATA RACE
Read at 0x00c0001d0000 by goroutine 25:
  github.com/lucas-clemente/quic-go/internal/flowcontrol.(*baseFlowController).sendWindowSize()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/base_flow_controller.go:60 +0x51
  github.com/lucas-clemente/quic-go/internal/flowcontrol.(*connectionFlowController).SendWindowSize()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/connection_flow_controller.go:42 +0x4d
  github.com/lucas-clemente/quic-go/internal/flowcontrol.(*streamFlowController).SendWindowSize()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/stream_flow_controller.go:110 +0x96
  github.com/lucas-clemente/quic-go.(*sendStream).Write()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:98 +0x588
  github.com/lucas-clemente/quic-go.(*stream).Write()
      <autogenerated>:1 +0xcf
  NGN/common/sm-quic-conn.(*Conn).Write()
      /home/go/src/NGN/common/sm-quic-conn/conn.go:311 +0x101
  NGN/common/sm-conn.SendSMHeader()
      /home/go/src/NGN/common/sm-conn/sm-conn.go:165 +0x55a
  NGN/common/sm-conn.SMhandshake()
      /home/go/src/NGN/common/sm-conn/sm-conn.go:224 +0x1a1
  main.forwardConnection()
      /home/go/src/NGN/cmd/client/client.go:319 +0xbd2

Previous write at 0x00c0001d0000 by goroutine 15:
  github.com/lucas-clemente/quic-go/internal/flowcontrol.(*baseFlowController).AddBytesSent()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/base_flow_controller.go:46 +0x79
  github.com/lucas-clemente/quic-go/internal/flowcontrol.(*connectionFlowController).AddBytesSent()
      <autogenerated>:1 +0x61
  github.com/lucas-clemente/quic-go/internal/flowcontrol.(*streamFlowController).AddBytesSent()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/stream_flow_controller.go:106 +0xa6
  github.com/lucas-clemente/quic-go.(*sendStream).getDataForWriting()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:232 +0x55e
  github.com/lucas-clemente/quic-go.(*sendStream).popStreamFrameImpl()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:178 +0x366
  github.com/lucas-clemente/quic-go.(*sendStream).popStreamFrame()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:155 +0xa9                                                                     
  github.com/lucas-clemente/quic-go.(*stream).popStreamFrame()                                                                                                   
      <autogenerated>:1 +0x86
  github.com/lucas-clemente/quic-go.(*framerI).AppendStreamFrames()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/framer.go:95 +0x44d
  github.com/lucas-clemente/quic-go.(*packetPacker).composeNextPacket()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/packet_packer.go:349 +0x5b3
  github.com/lucas-clemente/quic-go.(*packetPacker).PackPacket()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/packet_packer.go:264 +0x3cd
  github.com/lucas-clemente/quic-go.(*session).sendPacket()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/session.go:1105 +0x22b
  github.com/lucas-clemente/quic-go.(*session).sendPackets()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/session.go:1013 +0x6f4
  github.com/lucas-clemente/quic-go.(*session).run()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/session.go:426 +0x1688
  github.com/lucas-clemente/quic-go.(*client).establishSecureConnection.func1()
      /home/go/go/pkg/mod/github.com/lucas-clemente/quic-go@v0.11.2/client.go:269 +0x85

Goroutine 25 (running) created at:
  main.XXX()
      XXX/client.go:250 +0x79f
  main.clientMain()
      XXX/client.go:168 +0x6a8
  main.(*program).Run()
      XXX/client.go:569 +0x139
marten-seemann commented 5 years ago

Can you provide some details on how to reproduce this? We're running all of our integration tests with race detector, and this never showed up.

apmattil commented 5 years ago

Hi I got it with running app with dlv and race flag.. read/write is with tight loop with heavy load (fast.com) I'll check how often I get it.

apmattil commented 5 years ago

I can repeate this without fast.com: test client sends packets my proxy app, test server replies to packets one by one. usually one of the 4 first streams hits the race.

marten-seemann commented 5 years ago

Could you upload a minimal example that triggers this bug? We have quite a lot of integration tests, and we're running them with race detector on our CI, but for some reason it doesn't catch this one. Being able to reproduce this would be super helpful in fixing this race condition.

apmattil commented 5 years ago

do you have test case for several streams having continious data (not just http request/reply) ? could it be that http-reply takes some time to generate and problem does not appear in your tests ? or it is just dlv with race that triggers false positive race condition. I can do it, but probably not very soon.

marten-seemann commented 5 years ago

We have all those test, and none of them ever showed a race condition.

marten-seemann commented 5 years ago

@apmattil Any updates on this?

apmattil commented 5 years ago

sorry, I can not do anything for this in next tree weeks. I'll try then provide you a version to repeate issue.

On Sat, Aug 10, 2019 at 7:11 PM Marten Seemann notifications@github.com wrote:

@apmattil https://github.com/apmattil Any updates on this?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lucas-clemente/quic-go/issues/2013?email_source=notifications&email_token=AE42IFBPATBDZGXII53PVALQD3SEZA5CNFSM4IF4I4B2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4AQMEI#issuecomment-520160785, or mute the thread https://github.com/notifications/unsubscribe-auth/AE42IFDHQ6ML4ETZ5WG3XKDQD3SEZANCNFSM4IF4I4BQ .

apmattil commented 5 years ago

I got now stripped down version, but I would not like to put it as public. where could I upload it ?

marten-seemann commented 5 years ago

Maybe create a Gist?

apmattil commented 5 years ago

too many files to cut and paste, could I just upload tarball to somewhere ?

On Tue, Aug 27, 2019 at 12:12 PM Marten Seemann notifications@github.com wrote:

Maybe create a Gist https://gist.github.com/?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lucas-clemente/quic-go/issues/2013?email_source=notifications&email_token=AE42IFGZSMAEACZSDELV4N3QGTVZNA5CNFSM4IF4I4B2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5HCHCI#issuecomment-525214601, or mute the thread https://github.com/notifications/unsubscribe-auth/AE42IFEEHFH6NIXKWRC3P6LQGTVZNANCNFSM4IF4I4BQ .

marten-seemann commented 5 years ago

Ok, feel free to upload it anywhere you like.

marten-seemann commented 5 years ago

@apmattil Thanks for the file. I was asking for a minimal example, not 3500 lines of code though.

apmattil commented 5 years ago

I stripped about 60% of code, if I would spend more time to strip it I could get it smaller. But if you can repeat the issue with that we should be good ? Problem probably is that window variable is shared and it just shows up when you have multiple sessions created with short interval.

On Tue, Aug 27, 2019 at 1:41 PM Marten Seemann notifications@github.com wrote:

@apmattil https://github.com/apmattil Thanks for the file. I was asking for a minimal example, not 3500 lines of code though.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lucas-clemente/quic-go/issues/2013?email_source=notifications&email_token=AE42IFCUWH737PC7WHHLLOLQGUAF5A5CNFSM4IF4I4B2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5HJYBQ#issuecomment-525245446, or mute the thread https://github.com/notifications/unsubscribe-auth/AE42IFGXTTHBDICMOSBBGZLQGUAF5ANCNFSM4IF4I4BQ .

marten-seemann commented 5 years ago

A minimum working example shouldn't have maybe 100-200 lines of code, not thousands. I'd have to spend many hours to understand what your code doing.

apmattil commented 5 years ago

I can understand it would be really hard to understand first revision of code (my first real go code), but the problem at raceback trace has only 3 frames of my code. As I see it you really do not need to understand rest of my code.

Problem points to baseFlowController.AddBytesSent() and sendWindowSize.sendWindowSize() where c.bytesSent is shared between go routines. Other is doing handshake ? and one (mine) is doing write. baseFlowController is for all streams of the session ? so c.bytesSent needs mutex ?

==================

WARNING: DATA RACE Read at 0x00c0001e8000 by goroutine 27:

github.com/lucas-clemente/quic-go/internal/flowcontrol.(*connectionFlowController).SendWindowSize() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/base_flow_controller.go:60 +0x3f

github.com/lucas-clemente/quic-go/internal/flowcontrol.(streamFlowController).SendWindowSize() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/stream_flow_controller.go:110 +0xa2 github.com/lucas-clemente/quic-go.(sendStream).Write() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:98 +0x349 github.com/lucas-clemente/quic-go.(*stream).Write()

:1 +0x6f NGN/common/sm-quic-conn.(*Conn).Write() /home/go/src/striped/NGN/common/sm-quic-conn/conn.go:363 +0x9d NGN/common/sm-conn.SendSMHeader() /home/go/src/striped/NGN/common/sm-conn/sm-conn.go:157 +0x305 NGN/common/sm-conn.SMhandshake() /home/go/src/striped/NGN/common/sm-conn/sm-conn.go:216 +0xbb main.forwardConnection() /home/go/src/striped/NGN/cmd/client/client.go:248 +0x501 Previous write at 0x00c0001e8000 by goroutine 14: github.com/lucas-clemente/quic-go/internal/flowcontrol.(*baseFlowController).AddBytesSent() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/base_flow_controller.go:46 +0x55 github.com/lucas-clemente/quic-go/internal/flowcontrol.(*connectionFlowController).AddBytesSent() :1 +0x4d github.com/lucas-clemente/quic-go/internal/flowcontrol.(*streamFlowController).AddBytesSent() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/internal/flowcontrol/stream_flow_controller.go:106 +0xa1 github.com/lucas-clemente/quic-go.(*sendStream).getDataForWriting() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:232 +0x2be github.com/lucas-clemente/quic-go.(*sendStream).popStreamFrameImpl() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:178 +0x1cf github.com/lucas-clemente/quic-go.(*sendStream).popStreamFrame() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/send_stream.go:155 +0x56 github.com/lucas-clemente/quic-go.(*stream).popStreamFrame() :1 +0x53 github.com/lucas-clemente/quic-go.(*framerI).AppendStreamFrames() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/framer.go:95 +0x2ca github.com/lucas-clemente/quic-go.(*packetPacker).composeNextPacket() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/packet_packer.go:349 +0x1c2 github.com/lucas-clemente/quic-go.(*packetPacker).PackPacket() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/packet_packer.go:264 +0x18f github.com/lucas-clemente/quic-go.(*session).sendPacket() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/session.go:1105 +0xce github.com/lucas-clemente/quic-go.(*session).sendPackets() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/session.go:1013 +0x1d0 github.com/lucas-clemente/quic-go.(*session).run() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/session.go:426 +0xa8f github.com/lucas-clemente/quic-go.(*client).establishSecureConnection.func1() /home/go/go/pkg/mod/ github.com/lucas-clemente/quic-go@v0.11.2/client.go:269 +0x65 On Tue, Aug 27, 2019 at 3:05 PM Marten Seemann wrote: A minimum working example shouldn't have maybe 100-200 lines of code, not thousands. I'd have to spend many hours to understand what your code doing. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub , or mute the thread .
marten-seemann commented 5 years ago

These function should only be called from the session’s run loop. If this weren’t the case, we’d get failures in of our integration tests. I suspect the error is somewhere in your code (e.g. things would get racy if you call Write on the same stream from multiple Go routines), but I won’t dig through 3500 lines to locate it.

apmattil commented 5 years ago

backtrace of go routine 14 is all there.. my code does not call anything in it.. that started at c.session.run() My calls Write() of stream.. about 100% sure of the same session.

On Tue, Aug 27, 2019 at 4:56 PM Marten Seemann notifications@github.com wrote:

These function should only be called from the session’s run loop. If this weren’t the case, we’d get failures in of our integration tests. I suspect the error is somewhere in your code (e.g. things would get racy if you call Write on the same stream from multiple Go routines), but I won’t dig through 3500 lines to locate it.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/lucas-clemente/quic-go/issues/2013?email_source=notifications&email_token=AE42IFGJPKJWT7LEF57LATTQGUXANA5CNFSM4IF4I4B2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5H2ICQ#issuecomment-525313034, or mute the thread https://github.com/notifications/unsubscribe-auth/AE42IFDSG5FN4LSGGSM7ZUDQGUXANANCNFSM4IF4I4BQ .

marten-seemann commented 5 years ago

Closing this issue for now. I don’t see anything actionable here, and the code provided is orders of magnitude too complex to be helpful for debugging.

Feel free to reopen once you have a minimal example that triggers the supposed issue.

apmattil commented 5 years ago

Hi

This race condition was made by me. I had added SendWindowSize() to sendStream.Write() for logging when I was debugging congestion problem. Sorry again.