hanfei1991 / microcosm

a mini bench experiment for a task runtime scheduler

kill command cannot kill executor process without -9 #204

Open hanfei1991 opened 2 years ago

hanfei1991 commented 2 years ago

After running 15 cvs jobs, the executor can't be stopped with the ansible stop command.

hanfei1991 commented 2 years ago

Even though the executor loses heartbeats with the master, it still won't stop.

hanfei1991 commented 2 years ago

The worker-to-master heartbeat logs keep printing:

[2022/03/07 23:46:09.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:12.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:12.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:12.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:15.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:15.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:15.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:18.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:18.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:18.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:21.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:21.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:21.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:24.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:24.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:24.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:27.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:27.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:27.332 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:30.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:30.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:30.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:33.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:33.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:33.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:36.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:36.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:36.332 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:39.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:39.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:39.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:42.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:42.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:42.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:45.329 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:45.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:45.330 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:48.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:48.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:48.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:51.329 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:51.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:51.330 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:54.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:54.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:54.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:46:57.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:57.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:46:57.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:47:00.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:47:00.331 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:47:00.331 +08:00] [WARN] [worker.go:538] ["sending heartbeat ping encountered ErrPeerMessageSendTryAgain"]
[2022/03/07 23:47:03.330 +08:00] [DEBUG] [worker.go:531] ["sending heartbeat"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]
[2022/03/07 23:47:03.330 +08:00] [DEBUG] [worker.go:536] ["sending heartbeat success"] [worker=cd2123c0-8d6d-4b37-a107-9fd2d82ec016]