Closed rpedde closed 10 years ago
Is this using the same pipe name for both writers, or different pipe names?
On Sat, May 3, 2014 at 12:19 AM, Ron Pedde notifications@github.com wrote:
This might be related to the other assert, as it's in the same pattern:
open writer on hosta open corresponding reader on hostb open new writer on hostb
write to new writer handle, reader asserts:
zpipes_server.c:417: pipe_send_data: Assertion `self->reader' failed. Aborted
Writer animation:
14-05-02 17:14:23 I: starting zpipes server using config in '/etc/zbroker.cfg' 14-05-02 17:14:23 I: joining cluster as DEC3D1B4C98A5003B10285003C1CD50F 14-05-02 17:14:23 N: starting zpipes_server service 14-05-02 17:14:23 N: binding zpipes service to 'ipc://@/zpipes/local' 14-05-02 17:14:33 I: ZPIPES server appeared at 64ECAC5F2DDC5D91320A6CD4AA3079A4 14-05-02 17:14:42 0: start: 14-05-02 17:14:42 0: OUTPUT 14-05-02 17:14:42 0: $ lookup or create pipe 14-05-02 17:14:42 0: $ open pipe writer 14-05-02 17:14:42 0: > before writing 14-05-02 17:14:42 0: before writing: 14-05-02 17:14:42 0: ok 14-05-02 17:14:42 0: $ send OUTPUT_OK 14-05-02 17:14:42 0: > writing 14-05-02 17:14:42 1: start: 14-05-02 17:14:42 1: OUTPUT 14-05-02 17:14:42 1: $ lookup or create pipe 14-05-02 17:14:42 1: $ open pipe writer 14-05-02 17:14:42 1: > before writing 14-05-02 17:14:42 1: before writing: 14-05-02 17:14:42 1: ok 14-05-02 17:14:42 1: $ send OUTPUT_OK 14-05-02 17:14:42 1: > writing 14-05-02 17:14:42 1: writing: 14-05-02 17:14:42 1: WRITE 14-05-02 17:14:42 1: $ process write request 14-05-02 17:14:42 1: > processing write 14-05-02 17:14:42 1: processing write: 14-05-02 17:14:42 1: have reader 14-05-02 17:14:42 1: $ pass data to reader 14-05-02 17:14:42 1: $ send WRITE_OK 14-05-02 17:14:42 1: > writing 14-05-02 17:14:42 1: writing: 14-05-02 17:14:42 1: CLOSE 14-05-02 17:14:42 1: $ close pipe writer 14-05-02 17:14:42 1: $ send CLOSE_OK 14-05-02 17:14:42 1: > start 14-05-02 17:14:52 0: writing: 14-05-02 17:14:52 0: expired 14-05-02 17:14:52 0: $ close pipe writer 14-05-02 17:14:52 0: $ terminate 14-05-02 17:14:52 1: start: 14-05-02 17:14:52 1: expired 14-05-02 17:14:52 1: $ terminate 14-05-02 17:14:52 I: ZPIPES server vanished from 64ECAC5F2DDC5D91320A6CD4AA3079A4
Reader animation:
14-05-02 17:14:33 I: starting zpipes server using config in 'zbroker.cfg' 14-05-02 17:14:33 I: joining cluster as 64ECAC5F2DDC5D91320A6CD4AA3079A4 14-05-02 17:14:33 N: starting zpipes_server service 14-05-02 17:14:33 N: binding zpipes service to 'ipc://@/zpipes/local' 14-05-02 17:14:33 I: ZPIPES server appeared at DEC3D1B4C98A5003B10285003C1CD50F 14-05-02 17:14:42 643: start: 14-05-02 17:14:42 643: INPUT 14-05-02 17:14:42 643: $ lookup or create pipe 14-05-02 17:14:42 643: $ open pipe reader 14-05-02 17:14:42 643: > before reading 14-05-02 17:14:42 643: before reading: 14-05-02 17:14:42 643: ok 14-05-02 17:14:42 643: $ send INPUT_OK 14-05-02 17:14:42 643: > reading lt-zbroker: zpipes_server.c:417: pipe_send_data: Assertion `self->reader' failed. Aborted
Strangely, the pipe that was written to didn't have a reader. I'm thinking some wires got crossed here. The writer should have blocked on no reader, but it sent data.
Let me know if there is more repro data I can provide.
— Reply to this email directly or view it on GitHubhttps://github.com/zeromq/zbroker/issues/60 .
Whenever there's an assertion, can you try to get the backtrace? The animation is valuable as well, thanks.
set break on that assert, but it doesn't assert in gdb. :/ I'll keep trying though.
I think it's related to this.
This should block. But sometimes it doesn't. It's racy, though, and I had to load my machine down to the point that smoke was coming out the vent fans, but this is what happens:
Writer logs:
zbroker service/0.0.1
Copyright (c) 2014 the Contributors
This Software is provided under the MPLv2 License on an "as is" basis,
without warranty of any kind, either expressed, implied, or statutory.
14-05-03 15:14:19 I: starting zpipes server using config in 'zbroker-1.cfg'
14-05-03 15:14:19 I: joining cluster as 89301E
14-05-03 15:14:19 N: starting zpipes_server service
14-05-03 15:14:19 N: binding zpipes service to 'ipc://@/zpipes/local'
14-05-03 15:14:19 I: ZPIPES server appeared at BE79E0
14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: open local writer
14-05-03 15:14:19 D: remote=BE79E0 command=HAVE READER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast
14-05-03 15:14:19 D: remote=BE79E0 command=HAVE READER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast
14-05-03 15:14:19 D: 310:169e4cbd-62f3-4c30-afd: open local writer
14-05-03 15:14:19 D: remote=BE79E0 command=HAVE READER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast
14-05-03 15:14:19 D: 310:169e4cbd-62f3-4c30-afd: write 2 bytes
14-05-03 15:14:19 D: 310:169e4cbd-62f3-4c30-afd: close local writer
14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: close local writer
interrupted
14-05-03 15:14:19 N: terminating zpipes_server service
From the writer perspective, it wrote the data, because there was a reader. But there was no reader open. (!?!?!)
Reader logs:
zbroker service/0.0.1
Copyright (c) 2014 the Contributors
This Software is provided under the MPLv2 License on an "as is" basis,
without warranty of any kind, either expressed, implied, or statutory.
14-05-03 15:14:19 I: starting zpipes server using config in 'zbroker-2.cfg'
14-05-03 15:14:19 I: joining cluster as BE79E0
14-05-03 15:14:19 N: starting zpipes_server service
14-05-03 15:14:19 N: binding zpipes service to 'ipc://@/zpipes/local2
14-05-03 15:14:19 I: ZPIPES server appeared at 89301E
14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=broadcast
14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast
14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: open local reader
14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=broadcast
14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast
14-05-03 15:14:19 D: remote=89301E command=DATA pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast
14-05-03 15:14:19 D: discard 2 bytes, unroutable
14-05-03 15:14:19 D: remote=89301E command=DROP WRITER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast
14-05-03 15:14:19 D: remote=89301E command=DROP WRITER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast
14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: close local reader
14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: close local reader
interrupted
14-05-03 15:14:19 N: terminating zpipes_server service
I think this condition used to assert, but now it's discarding and continuing.
:-) OK, I'll get onto it tomorrow morning. The logs should be enough, thanks!
On Sat, May 3, 2014 at 10:23 PM, Ron Pedde notifications@github.com wrote:
I think it's related to this.
- open write pipe named 2a982903-ed15-422b-87d5-6afc418e193a on writer host
- open reader pipe for 2a982903-ed15-422b-87d5-6afc418e193a on reader host
- open write pipe named 169e4cbd-62f3-4c30-afd7-8b00d82b0f6e on writer host
- write to write pipe 169e4cbd-62f3-4c30-afd7-8b00d82b0f6e
This should block. But sometimes it doesn't. It's racy, though, and I had to load my machine down to the point that smoke was coming out the vent fans, but this is what happens:
Writer logs:
zbroker service/0.0.1 Copyright (c) 2014 the Contributors This Software is provided under the MPLv2 License on an "as is" basis, without warranty of any kind, either expressed, implied, or statutory.
14-05-03 15:14:19 I: starting zpipes server using config in 'zbroker-1.cfg' 14-05-03 15:14:19 I: joining cluster as 89301E 14-05-03 15:14:19 N: starting zpipes_server service 14-05-03 15:14:19 N: binding zpipes service to 'ipc://@/zpipes/local' 14-05-03 15:14:19 I: ZPIPES server appeared at BE79E0 14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: open local writer 14-05-03 15:14:19 D: remote=BE79E0 command=HAVE READER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast 14-05-03 15:14:19 D: remote=BE79E0 command=HAVE READER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast 14-05-03 15:14:19 D: 310:169e4cbd-62f3-4c30-afd: open local writer 14-05-03 15:14:19 D: remote=BE79E0 command=HAVE READER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast 14-05-03 15:14:19 D: 310:169e4cbd-62f3-4c30-afd: write 2 bytes 14-05-03 15:14:19 D: 310:169e4cbd-62f3-4c30-afd: close local writer 14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: close local writer interrupted 14-05-03 15:14:19 N: terminating zpipes_server service
From the writer perspective, it wrote the data, because there was a reader. But there was no reader open. (!?!?!)
Reader logs:
zbroker service/0.0.1 Copyright (c) 2014 the Contributors This Software is provided under the MPLv2 License on an "as is" basis, without warranty of any kind, either expressed, implied, or statutory.
14-05-03 15:14:19 I: starting zpipes server using config in 'zbroker-2.cfg' 14-05-03 15:14:19 I: joining cluster as BE79E0 14-05-03 15:14:19 N: starting zpipes_server service 14-05-03 15:14:19 N: binding zpipes service to 'ipc://@/zpipes/local2 14-05-03 15:14:19 I: ZPIPES server appeared at 89301E 14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=broadcast 14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast 14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: open local reader 14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=broadcast 14-05-03 15:14:19 D: remote=89301E command=HAVE WRITER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast 14-05-03 15:14:19 D: remote=89301E command=DATA pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast 14-05-03 15:14:19 D: discard 2 bytes, unroutable 14-05-03 15:14:19 D: remote=89301E command=DROP WRITER pipe=169e4cbd-62f3-4c30-afd7-8b00d82b0f6e mode=unicast 14-05-03 15:14:19 D: remote=89301E command=DROP WRITER pipe=2a982903-ed15-422b-87d5-6afc418e193a mode=unicast 14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: close local reader 14-05-03 15:14:19 D: 309:2a982903-ed15-422b-87d: close local reader interrupted 14-05-03 15:14:19 N: terminating zpipes_server service
— Reply to this email directly or view it on GitHubhttps://github.com/zeromq/zbroker/issues/60#issuecomment-42115431 .
Found the cause, I think. Bad test on pipe reader, causing node to accept DATA when it didn't have a reader. Old code asserted (which was arguably the better response); new code discards the data (which is more robust since DATA can come late, after a reader has disappeared).
This might be related to the other assert, as it's in the same pattern:
open writer on hosta open corresponding reader on hostb open new writer on hostb
write to new writer handle, reader asserts:
Writer animation:
Reader animation:
Strangely, the pipe that was written to didn't have a reader. I'm thinking some wires got crossed here. The writer should have blocked on no reader, but it sent data.
Let me know if there is more repro data I can provide.