questdb / go-questdb-client

Golang client for QuestDB's Influx Line Protocol
Apache License 2.0

Now that Data Deduplication is available, make the LineSender safer #21

Closed probably-not closed 8 months ago

probably-not commented 1 year ago

There are several issues and pull requests throughout this repo that discuss the fact that the LineSender is not fully safe: there are problems with auto-reconnection, there's no way to retrieve a buffer that failed to send so it can be retried after a connection issue, etc.

@puzpuzpuz has said that this can't be implemented without data deduplication on the server side.

According to the QuestDB docs site, data deduplication has been available in the database since QuestDB 7.3: https://questdb.io/docs/concept/deduplication/

Can we get this MAJOR ISSUE fixed (at the very least as an option on the client that we can enable), so that we have better safety when there's a connection issue, don't lose data, and don't have to implement reconnection logic on our own?

puzpuzpuz commented 1 year ago

Before we can benefit from dedup, we need to add commit propagation for clients (see somewhat related roadmap item: https://github.com/questdb/roadmap/issues/52). Without this, the client won't be able to tell which lines are safe to dispose of and which should be kept around to be resent. That's certainly something that we'll be working on in the future and once that's done we'll be able to expose client configuration for safer ingestion.
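To illustrate why commit propagation matters here, the following is a minimal sketch of the client-side bookkeeping it would enable. Everything in it is hypothetical: `pendingLines`, the sequence numbers, and the `Committed` notification are illustrative names, not part of go-questdb-client or the ILP protocol. It only assumes the server could somehow report the highest sequence number it has committed.

```go
package main

import "fmt"

// pendingLine pairs a sent line with a client-assigned sequence number.
// The sequence number is a hypothetical device for this sketch.
type pendingLine struct {
	seq  uint64
	text string
}

// pendingLines keeps every sent line until the server confirms the commit.
type pendingLines struct {
	nextSeq uint64
	lines   []pendingLine
}

// Sent records a line that was written to the connection and returns its
// sequence number.
func (p *pendingLines) Sent(text string) uint64 {
	p.nextSeq++
	p.lines = append(p.lines, pendingLine{seq: p.nextSeq, text: text})
	return p.nextSeq
}

// Committed drops every line with a sequence number <= upTo. Those lines
// are safe to dispose of; the rest must be kept around.
func (p *pendingLines) Committed(upTo uint64) {
	i := 0
	for i < len(p.lines) && p.lines[i].seq <= upTo {
		i++
	}
	p.lines = p.lines[i:]
}

// ToResend returns the lines that were sent but never confirmed.
func (p *pendingLines) ToResend() []string {
	out := make([]string, 0, len(p.lines))
	for _, l := range p.lines {
		out = append(out, l.text)
	}
	return out
}

func main() {
	var p pendingLines
	p.Sent("trades,symbol=ETH-USD price=2615.54")
	p.Sent("trades,symbol=BTC-USD price=39269.98")
	p.Sent("trades,symbol=ETH-USD price=2616.10")

	// Pretend the server confirmed the first two lines, then the
	// connection dropped.
	p.Committed(2)
	fmt.Println(p.ToResend())
}
```

Without a confirmation signal like the assumed `Committed` call, the client has no basis for trimming the list, so it would have to either keep everything or risk dropping lines the server never stored.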

probably-not commented 1 year ago

@puzpuzpuz I agree that commit/error propagation is important, but a reconnect+resend of the current batch is something that can be added regardless of commit/error propagation. If there's a TCP write error or a broken pipe, the entire batch is lost on the client side, which is a pretty critical thing to lose. This doesn't relate to propagating errors from the actual write itself, just to making sure that we don't lose data when there is a TCP error.

This feature can be added now, so that clients don't lose data due to TCP issues (regardless of database write issues).

puzpuzpuz commented 1 year ago

I got your point. Yes, such a change is certainly possible with a few changes in the way we deal with the buffer. I discussed this with the team and we'll be making this change, but not in the near future. In the meantime, we're certainly open to contributions.

puzpuzpuz commented 8 months ago

Closing this one since v3 shipped an HTTP sender, which allows explicit control over transactions and has automatic retry behavior.