Open levin-go opened 1 year ago
func (t tokenProcessor) nextToken() (tokenStruct, error) {
// we do this separate non-blocking check on token channel to
// prioritize it over cancellation channel
select {
case tok, more := <-t.tokChan:
err, more := tok.(error)
if more {
// this is an error and not a token
return nil, err
} else {
return tok, nil
}
default:
// there are no tokens on the channel, will need to wait
}
select {
case tok, more := <-t.tokChan:
if more {
err, ok := tok.(error)
if ok {
// this is an error and not a token
return nil, err
} else {
return tok, nil
}
} else {
// completed reading response
return nil, nil
}
case <-t.ctx.Done():
// It seems the Message function on t.outs.msgq doesn't get the Done if it comes here instead
if t.outs.msgq != nil {
_ = sqlexp.ReturnMessageEnqueue(t.ctx, t.outs.msgq, sqlexp.MsgNextResultSet{})
}
if t.noAttn {
return nil, t.ctx.Err()
}
if err := sendAttention(t.sess.buf); err != nil {
// unable to send attention, current connection is bad
// notify caller and close channel
return nil, err
}
// now the server should send cancellation confirmation
// it is possible that we already received full response
// just before we sent cancellation request
// in this case current response would not contain confirmation
// and we would need to read one more response
// first lets finish reading current response and look
// for confirmation in it
if len(t.tokChan) > 0 && readCancelConfirmation(t.tokChan) {
// we got confirmation in current response
return nil, t.ctx.Err()
}
// we did not get cancellation confirmation in the current response
// read one more response, it must be there
t.tokChan = make(chan tokenStruct, 5)
go processSingleResponse(t.ctx, t.sess, t.tokChan, t.outs)
if len(t.tokChan) > 0 && readCancelConfirmation(t.tokChan) {
return nil, t.ctx.Err()
}
// we did not get cancellation confirmation, something is not
// right, this connection is not usable anymore
return nil, errors.New("did not get cancellation confirmation from the server")
}
}
if len(t.tokChan) > 0 && readCancelConfirmation(t.tokChan)
and if len(t.tokChan) > 0 && readCancelConfirmation(t.tokChan)
I have add if len(t.tokChan) > 0 to prevent blocking.But I'm not sure if this will cause other problems.
Yep, the same problem @styfstr Have you thought about doing PR?
if this issue is ARM-specific, isn't the root cause more likely to be somewhere lower in the networking stack in a core Go library? The current PR fails all the tests due to race condition detection.
Probably. But that logic in token.go
looks a little bit strange if I understood it correctly. We waiting for the cancellation approval from a mssql server, but the server is already shut down (in my case). So, our ctx cancel does nothing and we just hanging forever on the loop by an empty channel
The same issue exists in the original repo since 2019 https://github.com/denisenkom/go-mssqldb/issues/527
I'm trying to understand this issue ... wouldn't you expect Ping() to block in this situation? As long as the TCP connection is still alive after removing the network cable, that is.
For what it is worth, I can't reproduce this (on ARM64) when instead of pulling a physical network cable, I terminate a TCP tunnel (made with ssh between the ARM64 test environment and an SQL SERVER instance).
In this scenario, the TCP connection is properly terminated (but without any messaging on the TDS / SQL SERVER level. Ping() immediately notices loss of connection:
# ./issue91
ping status ===== <nil>
ping status ===== <nil>
ping status ===== <nil>
ping status ===== <nil>
ping status ===== Read: EOF
Read: EOF
# go version
go version go1.21.3 linux/arm64
If Ping() takes too long, use PingContext() with a timeout. Then you'll run into issue #160 (the context is ignored), but this is a different matter.
@tiwo hello! Thank you for the response In the end, I solved my problem by wrapping all mssql calls in a goroutine that can just be left if the call hangs on a ping or something else)
Describe the bug In arm v7 (arm32), calling the ping() function will cause blocking.Through log tracing, it is found that blocking occurs in
Ping()->ExecContext()->s.exec(ctx, list)->s.processExec(ctx)->reader.iterateResponse()->t.nextToken()
To Reproduce
Expected behavior The database will be shut down normally without blocking. Appears only when the network cable between client and server is unplugged.
Further technical details
SQL Server version: SQL Server 2008 Operating system: arm v7 go version go1.18.2 windows/amd64 build command:
GOOS=linux GOARCH=arm go build