Closed mahlonsmith closed 3 years ago
I neglected to mention the core of this: if you manually exit 'nc' after it receives a batch of samples from the exporting module (closing the connection) and then start it back up, the sample that would have arrived during the pause isn't pushed with the next update and is lost. If you keep dropping the connection between sends, this effectively turns an 'update every = 5' into an 'update every = 10'.
@vlvkobal Could you take a look at this?
The simple connector worker uses the same basic main-loop flow for a connector instance that the backends subsystem used. The new exporting engine is multithreaded, though, so we need to protect data with mutexes. We can't make blocking calls while the data is locked, so they are made before the connector starts waiting for data.
This isn't a problem for persistent connections, but to handle short-lived connections correctly, we need to implement thread dispatching for every new connection, as was done with a ring buffer for the MongoDB exporting connector. Persistent connections should still be taken into account.
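To make the locking constraint concrete, here is a minimal Python sketch of the general pattern (a bounded ring buffer plus a dedicated sender thread). It is not the netdata or MongoDB connector code: `BatchRing`, `sender_loop`, and `send_batch` are hypothetical names chosen for illustration. The point it shows is that the lock is held only while a batch is moved in or out of the buffer, and any blocking connect or send happens after the lock is released.

```python
import threading
from collections import deque


class BatchRing:
    """Bounded buffer of batches; when full, the oldest batch is overwritten."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)
        self._cond = threading.Condition()
        self._closed = False

    def push(self, batch):
        # Called by the collecting thread; never blocks on network I/O.
        with self._cond:
            self._items.append(batch)
            self._cond.notify()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def pop(self):
        """Block until a batch is available; return None once closed and empty."""
        with self._cond:
            while not self._items and not self._closed:
                self._cond.wait()
            return self._items.popleft() if self._items else None


def sender_loop(ring, send_batch):
    """Drain the ring; send_batch (hypothetical) may connect and send, blocking freely."""
    while True:
        batch = ring.pop()   # the lock is held only inside pop()
        if batch is None:
            return
        send_batch(batch)    # blocking I/O happens with no lock held
```

With this shape, a batch produced while the destination is reconnecting stays in the buffer until the sender gets to it, rather than being skipped.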
I'm unsure if this is related to #9512, but throwing it here as a more generic problem report.
Bug report summary
The exporting module has different connection behavior than the now-deprecated backend environment.
The 'backend' connects to the destination when it has data queued to send and, upon a successful connection, immediately sends it. In contrast, the exporting module connects to the destination and sends data at the next "update every" interval. The end result is that destinations that close the TCP connection between samples can lose a sample between updates.
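The difference can be illustrated with a toy model in Python. This is only a sketch of the behavior as observed from the outside, not netdata's code: `ClosingDestination`, `run_backend_style`, and `run_exporting_style` are hypothetical names, and the model assumes the exporting side only notices a closed connection when it next tries to send.

```python
class ClosingDestination:
    """Toy destination that closes the connection after every batch it receives."""

    def __init__(self):
        self.received = []
        self.open = False

    def connect(self):
        self.open = True

    def send(self, sample):
        if not self.open:
            return False          # connection was closed by the destination
        self.received.append(sample)
        self.open = False         # destination closes the socket after each batch
        return True


def run_backend_style(ticks):
    """Connect only once data is queued, then send immediately."""
    dest = ClosingDestination()
    for sample in range(1, ticks + 1):
        dest.connect()
        dest.send(sample)
    return dest.received


def run_exporting_style(ticks):
    """Connect up front, then send at each interval on the existing connection."""
    dest = ClosingDestination()
    dest.connect()
    for sample in range(1, ticks + 1):
        if not dest.send(sample):
            # Stale connection: this sample is dropped, reconnect for the next interval.
            dest.connect()
    return dest.received
```

Under these assumptions, `run_backend_style(6)` delivers all six samples, while `run_exporting_style(6)` delivers only samples 1, 3, and 5, which matches the "every other sample" pattern described in this report.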
This feels like surprising behavior, if the exporting module is intended as a drop-in replacement for backend.
OS / Environment
Netdata version
Component Name
exporting.
Steps To Reproduce
Minimal backend configuration:
Minimal exporting configuration:
Start a netcat listener on 2222, and one on 2223. Observe the difference.
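If restarting netcat by hand is tedious, a small Python listener can stand in for it. This is a sketch, not part of the original reproduction: the function name `serve_one_batch_per_connection` and the idle timeout used to decide that a batch is complete are assumptions. It accepts a connection, reads whatever arrives until the sender goes quiet, then closes the socket, so the sender has to reconnect for every batch.

```python
import socket


def serve_one_batch_per_connection(listener, max_connections, idle_timeout=1.0):
    """Accept connections one at a time; yield one batch per connection, then close it."""
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        chunks = []
        with conn:
            conn.settimeout(idle_timeout)
            try:
                while True:
                    data = conn.recv(65536)
                    if not data:      # the sender closed the connection first
                        break
                    chunks.append(data)
            except socket.timeout:
                pass                  # sender went quiet: treat the batch as complete
        # Leaving the "with" block closes the socket, forcing a reconnect.
        yield b"".join(chunks)


# Example use (port taken from the reproduction steps above):
#   with socket.create_server(("127.0.0.1", 2223)) as listener:
#       for batch in serve_one_batch_per_connection(listener, max_connections=10):
#           print(len(batch), "bytes received")
```

Run one instance per port and compare how many batches each one receives over the same period.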
Expected behavior
I have many, many netdata clients all funneling their samples to a single destination, which is a traditional forking server. To conserve resources (1000s of forked processes and database connections) on this destination, the TCP socket is closed between sends, letting netdata just merrily reconnect on its next interval. Clearly, this is a tradeoff at the expense of establishing a TCP connection for each sample, but for this environment that's an easy trade to make.
My expectation was that behavior would be identical between backend and exporting. This appears not to be the case with either JSON or OpenTSDB; I didn't try the others.
Thanks all!