gopherdata / gophernotes

The Go kernel for Jupyter notebooks and nteract.
MIT License

Frequent kernel crashes if compiled with Go 1.14 #199

Closed: cosmos72 closed this issue 4 years ago

cosmos72 commented 4 years ago

The following error is logged on jupyter standard output:

2020/02/26 23:59:16 Error polling heartbeat channel: interrupted system call
[I 23:59:18.358 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports

I suspect this is due to the Go runtime exposing more EINTR errors on interrupted system calls, and libzmq (or some other code) not handling them properly. The POSIX specs state that an interrupted system call can (and most of the time should) simply be called again when this happens.

Maybe switching to go-zeromq/zmq4 (see pull request https://github.com/gopherdata/gophernotes/pull/195) could fix this?

sbinet commented 4 years ago

go-zeromq/zmq4 has been tested w/ Go-1.14 and I didn't see that EINTR issue. (let me update #195 to also add Go-1.14)

cosmos72 commented 4 years ago

Good to hear that. In the meantime, I found the place where I read about "Go runtime exposing more EINTR errors": Go 1.14 release notes

Alternatively, we could try to understand whether it's a bug in zmq4 or in gophernotes, i.e. which of the two should treat EINTR as "not really an error, just retry the system call" but doesn't.

sbinet commented 4 years ago

there's a thread on that on golang-{dev,nuts}:

cosmos72 commented 4 years ago

Thanks!

I fixed the most common case (poller.Poll() returning syscall.EINTR) in https://github.com/gopherdata/gophernotes/commit/7faaaffc304e8964a3e3c8b4a06585a9fb700af4

but judging from the threads you mention, EINTR can be returned in many more cases.