xloem closed this issue 8 years ago
Thanks for the dump, this will be interesting. There's supposed to be a backpressure mechanism for buffers and messages, so something like the signal probe would no longer get its work() function called if one of the downstream consumers was not consuming.
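To illustrate the idea, here is a conceptual sketch of that back-pressure rule. This is not the actual Pothos scheduler code; the types and names below are made up for illustration:

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Conceptual sketch of the back-pressure rule described above. The names and
// structure are illustrative, not the actual Pothos scheduler internals.
struct MessageQueue
{
    std::deque<int> messages;   // pending signal events/messages
    std::size_t maxDepth = 64;  // illustrative capacity

    bool full(void) const { return messages.size() >= maxDepth; }
};

// The scheduler would skip a producer's work() while any downstream queue is
// full, so a fast emitter (like a signal probe) cannot grow memory unboundedly.
bool producerReady(const std::vector<const MessageQueue *> &downstream)
{
    for (const auto *queue : downstream)
    {
        if (queue->full()) return false; // wait for the consumer to catch up
    }
    return true;
}
```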
If you are using the Pothos GUI, the Execute -> Show Topology Stats dump may be interesting. It's going to show total counts for all of the ports, including enqueued elements like messages and buffers. If you are using the API, the topology stats can also be dumped to a JSON string with queryJSONStats().
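For API users, a minimal sketch of dumping those stats, assuming queryJSONStats() is a method on the topology object returning the JSON text (only the method name comes from the comment above; the rest is illustrative):

```cpp
#include <Pothos/Framework.hpp>
#include <iostream>

int main(void)
{
    Pothos::Topology topology;
    // ... make blocks and topology.connect(...) calls here ...
    topology.commit();

    // Dump per-port counters (total elements, messages, queued buffers) as JSON
    std::cout << topology.queryJSONStats() << std::endl;
    return 0;
}
```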
> Maybe it would be good to have `RingDeque<>::set_capacity()`
The RingDeque can't grow beyond its capacity (it asserts in debug mode). There's actually code in the port handler that checks and resizes this queue. It should probably log when the queue has been resized absurdly out of bounds.
https://github.com/pothosware/pothos/blob/master/library/lib/Framework/InputPort.cpp#L57
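Something along these lines could flag runaway growth. This is a hypothetical helper, not the linked port-handler code; the threshold and the capacity()/set_capacity() interface are assumptions:

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>

// Hypothetical helper showing the kind of logging suggested above. The
// threshold and the capacity()/set_capacity() interface are assumptions.
template <typename Queue>
void growWithWarning(Queue &queue, const std::size_t required)
{
    static const std::size_t warnThreshold = 1024; // illustrative bound
    if (required <= queue.capacity()) return;
    std::size_t newCapacity = std::max<std::size_t>(queue.capacity(), 1);
    while (newCapacity < required) newCapacity *= 2; // same doubling strategy
    if (newCapacity > warnThreshold)
    {
        std::cerr << "input queue resized to " << newCapacity
                  << " elements -- a downstream block may not be consuming" << std::endl;
    }
    queue.set_capacity(newCapacity);
}
```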
Yes, exactly. The function you linked is in the backtrace. It's continually doubling the size of the RingDeque without bounds.
I committed a change to help track down and avoid issues like this: https://github.com/pothosware/pothos/commit/6f2dd5d8d113404be1d05021d80ffce6f20078f9 So now we at least know which block isn't consuming, and we don't take gigabytes of memory. ;-)
@xloem Do you still think there is a bug here? Was the signal probe connected to a consumer that wasn't interested in messages? If that was the case, at least the runtime now logs an error and tosses the data. On the other hand, if the producing-but-not-consuming was unexpected, I would like to figure that one out.
It was not expected, but perhaps it should have been.
I had instructed a signal probe to fire for every single sample and connected it to a text box. I imagine the GUI couldn't handle the data rate.
Then that means you must have had something triggering the signal probe. If that was the case, we are talking about a lot of signal events being emitted -- as it turns out, these particular ports were not being back-pressured. This commit should fix that: https://github.com/pothosware/pothos/commit/4df18bafc6a026310231714fbd61917b8e149d9e
Thanks, closing!
I restarted PothosGui today and my computer froze on the next bootup, eventually becoming responsive again with PothosUtil holding over 18 GB of address space. The process command line was:
/usr/local/bin/PothosUtil --require-active --proxy-server tcp://[::1]
I'm not aware of having enabled any network behavior in PothosGui. In /proc//smaps I found a 13 GB chunk of memory (7f128b3ef000-7f15cb3f4000). This backtrace of thread 5 shows objects in that range:
It seems the OutputPort of a SignalProbe is producing items which are not being consumed? I do have a SignalProbe in my topology which is being triggered very frequently.
Maybe it would be good to have `RingDeque<>::set_capacity()` throw an error if the capacity is above a threshold. It would have made my computer much more responsive. I tried to check the input side of the guilty connection and think it is a NetworkSink:
I've made a core dump, so I can investigate further or share it.
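For illustration, the guard suggested above might look something like this. This is hypothetical code, not the actual RingDeque; the bound and the exception type are made-up choices:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical bounded set_capacity() along the lines suggested above.
// The threshold and exception type are illustrative assumptions.
void checkedSetCapacity(const std::size_t requested)
{
    static const std::size_t maxSaneCapacity = 1 << 20; // illustrative bound
    if (requested > maxSaneCapacity)
    {
        throw std::length_error(
            "RingDeque capacity " + std::to_string(requested) +
            " exceeds sane bound -- is a downstream consumer stalled?");
    }
    // ... perform the normal capacity change here ...
}
```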