Open russorat opened 4 years ago
@russorat, there are a lot of details to discuss here, but let's start with the most important.
Let me remind you here, that the original request to this client was to provide a similar API experience with other clients, like Java or C#.
This proposal outlines a totally new client, with a new API and a new package name.
Also, as there was a major release of the current client a few months ago, doing a new major release with a completely new API will harm current users.
This leads to the idea if we wouldn' be better off with a new client in a new repo?
The Write API issues paragraph describes some issues that a reviewer has with specific parts of the current code. Could be they submitted as individual issues so that they can be discussed and fixed if necessary?
I just started to port our project to client v2 and it feels so different from previous version.
At first I was baffled that options fields are all private and then noticed setter methods - fluent api does not feel idiomatic go. Can you imagine being long time Java programmer and starting to look at project written in 'C'like java. All iterations over collections are for (int i = 0; i < 5; i++)
style instead of for-each loops or stream api these days - this is how it feels at start - little bit awkward.
As we use different precision for writes (some client application deployments use 1s
precision for data and some 1ns
) as I saw that precision is now time.duration. What happens when I provide 20s
by accident - currently defaults to ns
precision for write.
As precision can be different for every write(line) there seems to be no way anymore to set precision for single write. Only way to define precision is to create new client.
Is epoch
query parameter now removed from library API (https://docs.influxdata.com/influxdb/v2.0/reference/api/influxdb-1x/query/#epoch)? Now way even to use QueryAPI.Query with custom dialect with DateTimeFormat
. We were clever to use it to truncate time form ns
to s
just reiterating what @russorat wrote
Use of interfaces: all the client API entry points are defined in interface types. Since adding a method to an interface is a breaking change, this means that it's not possible to add new entry points without breaking the API.
This version of library suffers same problems as previous version - instead of returning (pointer to) structs methods return interfaces.
This makes:
Previous version at least had TCP/UDP client implementations to 'justify' returning interface but this time only 1 struct is implementing WriteAPIBlocking
for example. This 'polymorphic' behavior could have been solved by composition in previous version.
I think there is a Go proverb accept interfaces, return structs
and Go is a little bit different than c#/java with its interfaces
p.s. what if I need to add my own 'HTTP headers' for writes? There seems to be no way to access request.
p.s.s. is internal/write/service.HandleWrite
even goroutine safe? Lets assume that I'm using WriteAPIBlocking
and have instance in my service and use it to write from multiple goroutines - if I'm not wrong then w.retryQueue
can have race condition.
@aldas, thank you for your detailed feedback. To your issues:
As we use different precision for writes...
Can share more about your use case? e.g. how many writes of those with s and with ns precision per sec (or per min)
Effective writing (and batching) requires all points with the same precision. It would require keeping a buffer per precision and this could lead to delayed writing of some points. But this may not be an issue.
It would definitely make sense to add this to point by point writing, i.e. WriteAPIBlocking
now.
Is epoch query parameter now removed from library API (https://docs.influxdata.com/influxdb/v2.0/reference/api/influxdb-1x/query/#epoch)? No way even to use QueryAPI.Query with custom dialect with DateTimeFormat. We were clever to use it to truncate time form ns to s
The v1 InluxQL query endpoint is not supported by this library. V2 Flux query endpoint currently allows only RFC3339 datetime format.
what if I need to add my own 'HTTP headers' for writes? There seems to be no way to access request.
As you found out, request editing is not supported yet. This could add later,
is internal/write/service.HandleWrite even goroutine safe? Lets assume that I'm using WriteAPIBlocking and have instance in my service and use it to write from multiple goroutines - if I'm not wrong then w.retryQueue can have race condition.
write.Service
is not supposed to be synchronized. Synchronizing should be done in the layer above. WriteAPIBlocking
fails in this, thanks for pointing this out.
Can share more about your use case? e.g. how many writes of those with s and with ns precision per sec (or per min)
We have data logging application that logs ship data from different sources. Usually it is 3-6 PLCs (programmable logic controllers) with 100 to 2000 fields. Data from PLCs is stored at 1s precision (1hz is requirement). And then there is data from gyroscope which is logged at 50hz rate (50 request per second) and this is stored with ns precision. Little bit offtopic: we truncate time actually to second and millisecond just to get it out of Influx in one timeframe for latest.
we have our own - lets say async pipeline where input sends data to circular queue. There is background process to periodically (1s intervals) take data from queue and write it into Influx in batches. So if Influx slows down eventually we start to drop/overwrite data in queue.
write.Service is not supposed to be synchronized
This is one thing that bothers me a little - things are named as client
/ api
and they have this running miniature microverse in them. I would not expect that from client
. In my understanding it should just serialize my data and send it as a HTTP request.
This is little bit too complicated for serialize my slice of structs and send it with http request
.
I mean - client is caching actually APIs instances. so for same org+bucket I actually get same API but same time I should not use this API concurrently from multiple goroutines.
why cant writing be as dead simple as
type Influx struct {
apiURL string
// org, bucket etc.
// no references etc - struct that can considered as value 'object'
}
func main() {
influx := Influx{apiURL: ""} // is some fancy New(api, org, bucket, ...opts) constructor even needed?
var points []Point
req, err := influx.NewWriteRequest(context.Background(), points...)
if err != nil {
log.Fatalf("%v", err)
}
// this is where library user has total freedom to do whatever with request before sending it
// a) send it with ( err := influx.Write(req, [optionalClientsOwnHttpClientInstance]) ) something that combines existing:
// service.handleHTTPError(service.DoHTTPRequestWithResponse(req)) logic
// but does not have mutating state (queues or anything) can be almost considered as value object
// b) or roll your own
res, err := http.DefaultClient.Do(req)
if err != nil || (res.StatusCode < 200 || res.StatusCode >= 300) {
log.Fatalf("%v", err)
}
}
func (c *Influx) NewWriteRequest(ctx context.Context, point ...write.Point) (*http.Request, error) {
var body io.Reader
// ...serialize points ...
req, err := http.NewRequestWithContext(ctx, "POST", c.apiURL, body)
// ... add default headers
// req.Header.Set("Content-Encoding", "gzip")
// req.Header.Set("Authorization", s.authorization)
// req.Header.Set("User-Agent", http2.UserAgent)
return req, err
}
just some remarks
One more thing worth note is that the model of Point
differs among different packages:
Key()
, etc.)To sum up it's not clear which from above should be used for manipulating points with this client and InfluxDB 2.x version
I think:
this makes usage of those fields annoying. Passing values is perfectly OK for most of the thing in go. It is hard to understand why it is done so - slice itself is pointer type and having pointers to value makes little sense. Maybe slice to pointer because those types are meant to be modified - which makes little sense.
also by being private fields library has contructor methods like https://github.com/influxdata/influxdb-client-go/blob/fa15edd8bd9549750d2f93b6ccb386434df63fbe/api/query/table.go#L90
witch does not look like idiomatic Go and is annoying to use.
String()
implementations like https://github.com/influxdata/influxdb-client-go/blob/fa15edd8bd9549750d2f93b6ccb386434df63fbe/api/query/table.go#L195look out of place and not used anywhere. also there exists %#v
formating option for fmt
package https://golang.org/pkg/fmt/#pkg-overview
func TestFluxColumn_String(t *testing.T) {
record := NewFluxRecord(1, map[string]interface{}{ "table": 1, "_value": 1.75})
assert.Equal(t, `&query.FluxRecord{table:1, values:map[string]interface {}{"_value":1.75, "table":1}}`, fmt.Sprintf("%#v", record))
}
also Stringer
interface seems to be implemented for structs only in query
package so seems like unfinished/abandoned idea.
Release strategy?
Release plan:
@popey , @powersj FYI
// HTTPClient is used to make API requests.
Not sure if this is still relevant, but unless there are particular reasons for doing to, I would propose to make all user-configurable properties interfaces. In this case, and http.Doer
could be appropriate.
Proposal: new major version of InfluxDB Go client
Introduction
For a Go developer, the Go API client is likely to be the first concrete interaction they'll have with the Influx code base. Thus it's an important API and one where user experience is important. This proposal considers the current API from that point of view, and proposes a new major version of the module with a view to improving it.
Problem
Some aspects of a language interface to a network API package that are important:
idiomatic: the package should be "obvious" to people that are used to the language. upgradable: it should be possible to add new features to the API without breaking backward compatibility. easy to use correctly: using the package in the most obvious way should also be the correct way to use it. simple: the package should be as small and simple as possible given the constraints of the API. consistent: the package should be consistent with itself and with other parts of the Influx code base when possible.
The current Go API has some shortcomings in the above respects:
Options: could use a simple struct with fields rather than setters and getters. The current Options API increases the package surface area considerably. It supports a "fluent" API which isn't very idiomatic in Go compared to a struct literal. Use of interfaces: all the client API entry points are defined in interface types. Since adding a method to an interface is a breaking change, this means that it's not possible to add new entry points without breaking the API. Writing points to InfluxDB: the current WriteAPI is hard to use correctly and the current implementation could use improvement. See below for more discussion of this. The API defines its own
Point
type but also exposes types from the line-protocol package, mixing two not-necessarily-consistent line-protocol implementations. There are many exposed packages in the API module, making the whole API surface area hard to see. Go has a strong convention that the last element of the import path is the package name, but that's not the case ("influx-client-go" vs "influxdb2"). Time durations are often specified as integers, making it unclear what the units are. In Go, it's idiomatic to express durations astime.Duration
. The Find* endpoints tend to return a pointer-to-slice rather than the slice itself, which is probably not necessary.Write API issues
Go has a strong tradition of checking for errors, but WriteAPI makes it somewhat hard to check correctly for errors. It provides access to an error channel that can be read to find out any errors that occurred when points were written, but to use it correctly requires starting a goroutine and then synchronizing with that to obtain the errors. Consider the Write-API-Errors example from the documentation. This example is subtly wrong: even though the writer is flushed, there is no guarantee that all the errors have been printed by the time the program exits. The error-reading goroutine might have read from the channel but have context-switched before it gets around to printing the error.
The implementation uses goroutines to do the writes asynchronously under the hood, but from inspection of the code, it appears that there are some shortcomings with the implementation:
WriteAPI.Flush
should wait until all requests have been written, but it just polls waiting for a channel to become empty, and if there's a constant stream of points being written, that might never happen, soFlush
could block indefinitely.there appears to be no back-pressure: if there is some rate-limiting being applied by the server, the writer will be able to continue sending points until the buffer overflows, discarding the extra points.
Proposed Solution
I propose that almost all of the Go API gets pulled into a single package that exports only concrete types. There are existing Go modules that use this approach even for large APIs, and it seems to work well despite the overall size. The go-github package is one such example. Note that that package still uses separate types for different aspects of the API, but they're all concrete types, which means that it's possible to extend without breaking backward compatibility, and that the methods are all visible at a glance in the doc page.
I propose that an external package should be used for all the line-protocol encoding and decoding, but I don't believe that the current line-protocol module is sufficient for this - I will create another proposal for an update to that.
I propose that the point-writing API is simplified so that no error channel is required and errors can be checked in the usual Go way.
I propose that the package be renamed and given an import path that matches the name. We could use this to avoid making the client "v3".
To make it easier for clients to test against the API, I propose that a "fake" implementation of the HTTP API be provided so that clients can be tested in a more end-to-end fashion using
net/http/httptest
. It's possible that this could largely be generated automatically from the OpenAPI specification.Filtering
The current API has a plethora of methods for all the various
FindX
variants. This proposal uses a genericFilter
type. This has advantages and disadvantages. The main advantage is that the API is smaller and more regular. This is hopefully not outweighed by the fact that this papers over some irregularity in the API - not all API endpoints implement each kind of query. Perhaps this could be remedied in the future by adding some of the missing functionality to the API.Detailed Solution
Omissions
The following current API features were deliberately omitted from the above design:
Client.Setup
. This method is only there to set up an account initially, something which isn't really in the usual use case for using the Go API client. For that use case, using the net/http package directly seems like it would be appropriate. This means that the NewClient contract can be simplified so that it always requires a token, making API usage a little less error-prone.UsersAPI.SignIn
andSignOut
: if this functionality deemed important, then it seems like it would be better off living in the connection parameters rather than in the users API.QueryRaw
. The capability to define the returned "dialect" of the CSV doesn't seem to be hugely useful, as all the available annotations are provided when using the default query, and changing the CSV separator seems of marginal utility. For performance reasons, a client might wish to avoid the overhead of parsing CSV and just read the raw response body, but this is something that's not hard to do by issuing the request directly.Descending
. The current API definesPagingWithDescending
with respect to the result-paging API, but this doesn't appear to be used or available.CreateAuthorizationWithOrgID
. This is less general than providing the whole Organization struct and not a whole lot easier to use.UpdateAuthorizationStatusWithID
vs UpdateAuthorizationStatus: the latter is just minor syntax sugar over the former. Better to stick with one entry point here. Same applies to a bunch of other similar methods.Client.Options
,Client.ServerURL
,Client.HTTPService
. These methods seem to be of marginal utility.Implementation Plan
This package API is substantially different from the previous API, but most of the changes are fairly cosmetic, with the exception of the Writer API code which will require a complete rewrite.
Asynchronous point writing
Currently the Writer API uses several goroutines, is somewhat hard to understand and arguably not entirely correct. It should be possible to implement the new API using just one extra goroutine (to invoke any current call), with appropriate backpressure when the API is rate-limited.
API code generation
The code generated from the OpenAPI spec is arguably not ideal. It might be worth investigating to see if another generator is available that generates somewhat more compact and idiomatic code.
Plan of Action