egeberkaygulcan / dstest

runtime error: invalid memory address or nil pointer dereference while running the ratis example. #24

Open idilkara opened 2 weeks ago

idilkara commented 2 weeks ago

I built the Docker image with the name 'dd', then ran the following command and got a runtime error.

Input: docker run --rm -v ./configs:/configs -v ./output:/root/dstest/output dd run -c /configs/ratis.yml

Output:
ℹ tracking data in /tmp/stats.2412182344.json
ℹ tracking data in /tmp/stats.695245290.json
config: /configs/ratis.yml
Starting dstest
Name: ratis-test
[TestEngine] 2024/07/10 11:55:45 Starting experiment 1...
[TestEngine] 2024/07/10 11:55:45 Starting iteration 1
[NetworkManager] 2024/07/10 11:55:45 Network manager initialized
Faults:
[Worker 2] 2024/07/10 11:55:45 Running worker with: /root/dstest/scripts/ratis_server.sh 10006 10007 6002 n2
[Worker 1] 2024/07/10 11:55:45 Running worker with: /root/dstest/scripts/ratis_server.sh 10003 6001 10005 n1
[Worker 0] 2024/07/10 11:55:45 Running worker with: /root/dstest/scripts/ratis_server.sh 6000 10001 10002 n0
[NetworkManager] 2024/07/10 11:55:45 Network manager running
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x8a3d75]

goroutine 99 [running]:
github.com/egeberkaygulcan/dstest/cmd/dstest/network.(*Manager).UpdateChainClocks(...)
	github.com/egeberkaygulcan/dstest/cmd/dstest/network/manager.go:145
github.com/egeberkaygulcan/dstest/cmd/dstest/network.(*Router).QueueMessage(0xc000230e00, 0x0?)
	github.com/egeberkaygulcan/dstest/cmd/dstest/network/router.go:60 +0x1f5
github.com/egeberkaygulcan/dstest/cmd/dstest/network.(*HttpInterceptor).handleHttp2(0xc000234550, {0x12b8da0, 0xc000036570}, {0x12c8698, 0xc000270058})
	github.com/egeberkaygulcan/dstest/cmd/dstest/network/HttpInterceptor.go:153 +0x546
github.com/egeberkaygulcan/dstest/cmd/dstest/network.(*HttpInterceptor).handleConn(0xc000234550, 0xc000270058)
	github.com/egeberkaygulcan/dstest/cmd/dstest/network/HttpInterceptor.go:97 +0x3f9
github.com/egeberkaygulcan/dstest/cmd/dstest/network.(*HttpInterceptor).Run.func1.1()
	github.com/egeberkaygulcan/dstest/cmd/dstest/network/HttpInterceptor.go:65 +0x2b
created by github.com/egeberkaygulcan/dstest/cmd/dstest/network.(*HttpInterceptor).Run.func1 in goroutine 81
	github.com/egeberkaygulcan/dstest/cmd/dstest/network/HttpInterceptor.go:64 +0x1ba
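
For readers unfamiliar with this class of panic: the trace points at a method called on a pointer receiver, and addr=0x10 is consistent with a field being read at a small offset from a nil pointer. The sketch below reproduces that failure mode in isolation. The `Manager` type, its fields, and `tryUpdate` are hypothetical stand-ins for illustration and are not taken from the dstest source.

```go
package main

import "fmt"

// Manager is a hypothetical stand-in for a network manager.
// Its fields are illustrative only and do not mirror dstest.
type Manager struct {
	name        string
	chainClocks []int
}

// UpdateChainClocks reads a field through the receiver, so it
// faults with a nil pointer dereference when m is nil.
func (m *Manager) UpdateChainClocks(id int) {
	m.chainClocks[id]++
}

// tryUpdate calls UpdateChainClocks and converts a runtime panic
// into an error so the failure can be observed without crashing.
func tryUpdate(m *Manager, id int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	m.UpdateChainClocks(id)
	return nil
}

func main() {
	var missing *Manager // never initialized, so it is nil
	fmt.Println(tryUpdate(missing, 0))

	ready := &Manager{chainClocks: make([]int, 3)}
	fmt.Println(tryUpdate(ready, 0))
}
```

The first call returns a non-nil error and the second returns nil. Which pointer is actually nil in dstest, and why it only shows up in some environments, cannot be determined from the trace alone.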

joaomlneto commented 2 weeks ago

Does this happen on Windows (WSL)? I don't get the same issue when I build the image and call it dd (on an arm64 Mac):

❯ docker build -t dd .

Then I'm able to run it from the root of the repository:

❯ docker run --rm -v ./configs:/configs -v ./output:/root/dstest/output dd run -c /configs/ratis.yml
ℹ  tracking data in /tmp/stats.903313410.json
ℹ  tracking data in /tmp/stats.3073250263.json
config: /configs/ratis.yml
Starting dstest
Name: ratis-test
[TestEngine] 2024/07/10 12:15:50 Starting experiment 1...
[TestEngine] 2024/07/10 12:15:50 Starting iteration 1
[NetworkManager] 2024/07/10 12:15:50 Network manager initialized
Faults:
[NetworkManager] 2024/07/10 12:15:50 Network manager running
[Worker 2] 2024/07/10 12:15:50 Running worker with: /root/dstest/scripts/ratis_server.sh 10006 10007 6002 n2
[Worker 0] 2024/07/10 12:15:50 Running worker with: /root/dstest/scripts/ratis_server.sh 6000 10001 10002 n0
[Worker 1] 2024/07/10 12:15:50 Running worker with: /root/dstest/scripts/ratis_server.sh 10003 6001 10005 n1
[HTTP Interceptor 7] 2024/07/10 12:15:51 Sending from: 2
[HTTP Interceptor 6] 2024/07/10 12:15:51 Sending from: 2
[HTTP Interceptor 7] 2024/07/10 12:15:51 Copy 2 completed.
[HTTP Interceptor 7] 2024/07/10 12:15:51 WG, waited
[HTTP Interceptor 7] 2024/07/10 12:15:51 Handled http2
[HTTP Interceptor 2] 2024/07/10 12:15:52 Sending from: 0
[HTTP Interceptor 1] 2024/07/10 12:15:52 Sending from: 0
[HTTP Interceptor 5] 2024/07/10 12:15:52 Sending from: 1
[HTTP Interceptor 6] 2024/07/10 12:15:52 Copy 2 completed.
[HTTP Interceptor 6] 2024/07/10 12:15:52 WG, waited
[HTTP Interceptor 6] 2024/07/10 12:15:52 Handled http2
[HTTP Interceptor 3] 2024/07/10 12:15:52 Sending from: 1
[HTTP Interceptor 5] 2024/07/10 12:15:52 Copy 2 completed.
[HTTP Interceptor 5] 2024/07/10 12:15:52 WG, waited
[HTTP Interceptor 5] 2024/07/10 12:15:52 Handled http2
[HTTP Interceptor 1] 2024/07/10 12:15:52 Copy 2 completed.
[HTTP Interceptor 1] 2024/07/10 12:15:52 WG, waited
[HTTP Interceptor 2] 2024/07/10 12:15:52 Copy 2 completed.
[HTTP Interceptor 2] 2024/07/10 12:15:52 WG, waited
[HTTP Interceptor 1] 2024/07/10 12:15:52 Handled http2
[HTTP Interceptor 2] 2024/07/10 12:15:52 Handled http2
[HTTP Interceptor 3] 2024/07/10 12:15:52 Copy 2 completed.
[HTTP Interceptor 3] 2024/07/10 12:15:52 WG, waited
[HTTP Interceptor 3] 2024/07/10 12:15:52 Handled http2
[HTTP Interceptor 5] 2024/07/10 12:15:52 Sending from: 1
[HTTP Interceptor 5] 2024/07/10 12:15:52 Copy 2 completed.
[HTTP Interceptor 5] 2024/07/10 12:15:52 WG, waited
[HTTP Interceptor 5] 2024/07/10 12:15:52 Handled http2
[Worker 0] 2024/07/10 12:16:00 Timeout, killing process.
[Worker 2] 2024/07/10 12:16:00 Timeout, killing process.
[Worker 2] 2024/07/10 12:16:00 Killed worker 2
[Worker 2] 2024/07/10 12:16:00 Calling the clean script.
[Worker 0] 2024/07/10 12:16:00 Killed worker 0
[Worker 0] 2024/07/10 12:16:00 Calling the clean script.
[Worker 1] 2024/07/10 12:16:00 Timeout, killing process.
[Worker 1] 2024/07/10 12:16:00 Killed worker 1
[Worker 1] 2024/07/10 12:16:00 Calling the clean script.
[ProcessManager]2024/07/10 12:16:00 Worker 0 status: Timeout
[ProcessManager]2024/07/10 12:16:00 Worker 1 status: Timeout
[ProcessManager]2024/07/10 12:16:00 Worker 2 status: Timeout
[ProcessManager]2024/07/10 12:16:00 Found bug candidate at iteration 0
[TestEngine] 2024/07/10 12:16:00 Shutting down ProcessManager...
[TestEngine] 2024/07/10 12:16:00 Shutting down NetworkManager...
[NetworkManager] 2024/07/10 12:16:00 Chain:
[NetworkManager] 2024/07/10 12:16:00 (2)-(0)-(requestVote)-(3)
[NetworkManager] 2024/07/10 12:16:00 (0)-(2)-(requestVote)-(2)
[NetworkManager] 2024/07/10 12:16:00 (2)-(0)-(appendEntries)-(7)
[NetworkManager] 2024/07/10 12:16:00 (0)-(1)-(requestVote)-(13)
[NetworkManager] 2024/07/10 12:16:00 (1)-(2)-(requestVote)-(14)
[NetworkManager] 2024/07/10 12:16:00 Chain:
[NetworkManager] 2024/07/10 12:16:00 (2)-(1)-(requestVote)-(1)
[NetworkManager] 2024/07/10 12:16:00 (1)-(2)-(requestVote)-(5)
[NetworkManager] 2024/07/10 12:16:00 (2)-(1)-(appendEntries)-(8)
[NetworkManager] 2024/07/10 12:16:00 (1)-(2)-(requestVote)-(10)
[NetworkManager] 2024/07/10 12:16:00 Chain:
[NetworkManager] 2024/07/10 12:16:00 (0)-(1)-(requestVote)-(4)
[NetworkManager] 2024/07/10 12:16:00 (1)-(0)-(requestVote)-(6)
[NetworkManager] 2024/07/10 12:16:00 (0)-(2)-(requestVote)-(12)
[NetworkManager] 2024/07/10 12:16:00 Chain:
[NetworkManager] 2024/07/10 12:16:00 (2)-(1)-(appendEntries)-(9)
[NetworkManager] 2024/07/10 12:16:00 (1)-(0)-(requestVote)-(11)
[TestEngine] 2024/07/10 12:16:00 Shutdown complete.
[TestEngine] 2024/07/10 12:16:00 Checking for bugs...
[TestEngine] 2024/07/10 12:16:00 Iteration complete

egeberkaygulcan commented 2 weeks ago

Since we do not use chain clocks, I disabled their updates. Re-opening the issue as a backlog item for when we support them again.
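
One way to keep an unused feature from crashing the message path is to treat it as optional and skip it when it is not configured. This is a minimal sketch under that assumption, not the actual change made in dstest. The `Router` and `ChainClocks` types and their fields are hypothetical.

```go
package main

import "fmt"

// ChainClocks is a hypothetical optional component that counts
// messages per sender.
type ChainClocks struct {
	ticks map[int]int
}

// Router is a hypothetical router. clocks stays nil while the
// chain clock feature is disabled.
type Router struct {
	clocks *ChainClocks
	queued []string
}

// QueueMessage always enqueues the message, and updates the chain
// clocks only when they are configured.
func (r *Router) QueueMessage(sender int, msg string) {
	r.queued = append(r.queued, msg)
	if r.clocks == nil {
		return // feature disabled: nothing to update
	}
	r.clocks.ticks[sender]++
}

func main() {
	disabled := &Router{}
	disabled.QueueMessage(0, "requestVote")
	fmt.Println(len(disabled.queued))

	enabled := &Router{clocks: &ChainClocks{ticks: map[int]int{}}}
	enabled.QueueMessage(2, "appendEntries")
	fmt.Println(enabled.clocks.ticks[2])
}
```

With a guard like this, re-enabling chain clocks later only requires constructing the component; the message path itself stays unchanged.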