hyperledger / firefly-ethconnect

Ethereum connectivity bridge for Web Services & async Messaging APIs - familiar to Enterprise Apps
Apache License 2.0
75 stars 33 forks source link

TestSingleMessageWithReply: Intermittent failure "send on closed channel" #66

Open peterbroadhurst opened 4 years ago

peterbroadhurst commented 4 years ago

Intermittent build failure. Occurred in Travis for #65

time="2020-04-20T03:45:18Z" level=debug msg="Kafka consumer loop started"
time="2020-04-20T03:45:18Z" level=info msg="Sent message: {\"headers\":{\"type\":\"TestSingleMessageWithReply\",\"account\":\"0xAA983AD2a0e0eD8ac639277F37be42F2A5d2618c\",\"ctx\":{\"some\":\"data\"}}}"
time="2020-04-20T03:45:18Z" level=info msg="Kafka consumer received message: Partition=5 Offset=500"
time="2020-04-20T03:45:18Z" level=info msg="Message now in-flight: MsgContext[: reqOffset=in-topic:5:500 complete=false received=2020-04-20T03:45:18.473157259Z]"
time="2020-04-20T03:45:18Z" level=debug msg="Kafka producer successes loop started"
time="2020-04-20T03:45:18Z" level=info msg="Dispatched message context to processor: MsgContext[TestSingleMessageWithReply:9b5a99c5-c049-47e8-4f1f-ebc37184088f reqOffset=in-topic:5:500 complete=false received=2020-04-20T03:45:18.473157259Z]"
time="2020-04-20T03:45:18Z" level=info msg="Sending reply: MsgContext[TestSingleMessageWithReply:9b5a99c5-c049-47e8-4f1f-ebc37184088f reqOffset=in-topic:5:500 complete=false received=2020-04-20T03:45:18.473157259Z replied=2020-04-20T03:45:18.473294814Z replyType=TestReply]"
time="2020-04-20T03:45:18Z" level=info msg="Reply sent: MsgContext[TestSingleMessageWithReply:9b5a99c5-c049-47e8-4f1f-ebc37184088f reqOffset=in-topic:5:500 complete=false received=2020-04-20T03:45:18.473157259Z replied=2020-04-20T03:45:18.473294814Z replyType=TestReply]"
time="2020-04-20T03:45:18Z" level=debug msg="Ready=5:500 CanMark=true Infight=1 InflightSamePartition=1 ReadyToAck=1"
time="2020-04-20T03:45:18Z" level=info msg="Marking offset 500:5"
time="2020-04-20T03:45:18Z" level=info msg="Kafka consumer received message: Partition=5 Offset=500"
time="2020-04-20T03:45:18Z" level=info msg="Message now in-flight: MsgContext[: reqOffset=in-topic:5:500 complete=false received=2020-04-20T03:45:18.473496879Z]"
time="2020-04-20T03:45:18Z" level=error msg="Unauthorized: badness - Message={Headers:{CommonHeaders:{ID: MsgType:TestSingleMessageWithReply Account:0xAA983AD2a0e0eD8ac639277F37be42F2A5d2618c Context:map[some:data]}}}"
time="2020-04-20T03:45:18Z" level=info msg="Sending reply: MsgContext[TestSingleMessageWithReply: reqOffset=in-topic:5:500 complete=false received=2020-04-20T03:45:18.473496879Z replied=2020-04-20T03:45:18.473552483Z replyType=Error]"
panic: send on closed channel
goroutine 38 [running]:
github.com/kaleido-io/ethconnect/internal/kldkafka.(*msgContext).Reply(0xc000108900, 0xcc39a0, 0xc0000d3cc0)
    /home/travis/gopath/src/github.com/kaleido-io/ethconnect/internal/kldkafka/kafkabridge.go:280 +0x52d
github.com/kaleido-io/ethconnect/internal/kldkafka.(*KafkaBridge).ConsumerMessagesLoop(0xc000182fc0, 0xccc020, 0xc0001d27c0, 0xccc060, 0xc0001d6930, 0xc000029950)
    /home/travis/gopath/src/github.com/kaleido-io/ethconnect/internal/kldkafka/kafkabridge.go:348 +0x435
created by github.com/kaleido-io/ethconnect/internal/kldkafka.setupMocks
    /home/travis/gopath/src/github.com/kaleido-io/ethconnect/internal/kldkafka/kafkabridge_test.go:205 +0x24a
FAIL    github.com/kaleido-io/ethconnect/internal/kldkafka  0.048s
?       github.com/kaleido-io/ethconnect/internal/kldkafka/mock_sarama  [no test files]
ok      github.com/kaleido-io/ethconnect/internal/kldkvstore    0.019s  coverage: 100.0% of statements
ok      github.com/kaleido-io/ethconnect/internal/kldmessages   0.032s  coverage: 100.0% of statements
ok      github.com/kaleido-io/ethconnect/internal/kldopenapi    0.080s  coverage: 100.0% of statements
ok      github.com/kaleido-io/ethconnect/internal/kldrest   1.721s  coverage: 98.2% of statements
ok      github.com/kaleido-io/ethconnect/internal/kldtx 1.144s  coverage: 99.4% of statements
ok      github.com/kaleido-io/ethconnect/internal/kldutils  0.008s  coverage: 98.6% of statements
?       github.com/kaleido-io/ethconnect/pkg/kldplugins [no test files]
FAIL
peterbroadhurst commented 4 years ago

So there's a timing here when the Message already in-flight error message does not come out, and the duplicate actually gets processed - because the in-flight is flushed before the duplicate comes in.

peterbroadhurst commented 4 years ago

The problem was we were relying on async timing to validate duplicate detection. So in https://github.com/kaleido-io/ethconnect/pull/65 I've added a separate synchronous test for duplicate detection in the Kafka bridge, and pulled it out of the original test that had a separate primary purpose.