Closed lmangani closed 10 months ago
We're beginning to form a plan for a stand-alone FlightSQL client as a community contrib. Before diving in deeper I'd just like to check and confirm whether we could simply extend the existing experimental iox module to optionally support a custom endpoint? This would save a huge rewrite and would avoid lifting the size of flux itself with more dependencies. Any spare suggestions to get us on the right path from @sanderson perhaps? Thanks!
@lmangani
we could simply extend the existing experimental iox module to optionally support a custom endpoint?
If this is feasible, I say go for it! I don't know much about the actual implementation, but If this solves the problem in a simpler way, I'm all for that. Everything would just have to be backwards compatible. When it comes to reviewing the actual implementation, that will like be @wolffcm or @gavincabbage.
Thanks @sanderson for the input, I greatly appreciate it! The most backwards compatible way, would be indeed extending the existing iox contrib to become a little more generic and unlock further development without wasting the great work already put into it. All we need is a bit of a handover to make a transition to community sourced contributions a little easier.
@wolffcm @gavincabbage if you have any spare bandwidth to guide us we'll make your time worth it :)
Kind bump @wolffcm @gavincabbage any chance you could help unlock or suggest a viable approach to open the iox server settings so we can carry the existing extension forward? Thanks a million!
Hi @lmangani,
A good first step might be for you to look at https://github.com/influxdata/flux/blob/master/dependencies/iox/client.go
And specifically, it looks like the Provider
creates retrieves a client that just errors by default:
https://github.com/influxdata/flux/blob/d8995bbfb3ad23bbd166a4daaea7a5d9e967dce6/dependencies/iox/client.go#L37-L43
The typical workflow for an app that embeds Flux (such as the app that runs Flux in the InfluxDB 2.0 Cloud, or some other app that you've written) would inject an implementation of Client
into the context that can talk to the IOx instance that you care about. Injection is done by calling Inject
.
The client that you inject could come from here: https://github.com/influxdata/influxdb-iox-client-go There are some unit tests in that repo you can play around with to work it out. Note that this is an Apache Flight client.
Probably moving forward you may get more utility from being able to use a FlightSQL client from a Flux script. A good place for that might be in stdlib/experimental/flightsql
or something. FlightSQL is the latest and greatest communication protocol used by IOx, so you can expect it to support more features in the future. It is a standard so it would open up the possibility of being used with other data stores.
Sorry for the delay in resonse---hope that this gets you pointed in the right direction.
Thanks @wolffcm for the input, I appreciate it, but I'm not sure If I'm drawing the wrong conclusions from it.
I already have a few stand-alone FlightSQL clients working in go and other languages with a local IOx build so that part seems quite simple, but I still don't understand the current GetProvider implementation enough to determine if its extensible or a dead end.
I guess most users (me included) would need and want an IOx/Flight client working as independently any other Flux SQL source, and being able to specify the server/API dynamically on each request seems a hard requirement. Pairing a Flux instance with a specific Client
instance config seems a limiting approach only viable for something like InfluxCloud but not much for end-user scripts wanting to connect to arbitrary API endpoints to pull data.... Am I misunderstanding your indications here? At this point in time, writing a stand alone contrib flightsql client seems a bit less discouraging but I might be wrong.
Once again thanks for taking the time!
I guess most users (me included) would need and want an IOx/Flight client working as independently any other Flux SQL source, and being able to specify the server/API dynamically on each request seems a hard requirement.
Yes, that makes a lot of sense. iox.sql
was created for use in the cloud where the IOx server would be set via configuration of the app for all Flux queries.
For the InfluxCommunity
fork you are totally free to add a host and/or port parameter to iox.sql
, by updating the code here:
https://github.com/influxdata/flux/blob/d8995bbfb3ad23bbd166a4daaea7a5d9e967dce6/stdlib/experimental/iox/source.go#L34-L42
And also here: https://github.com/influxdata/flux/blob/d8995bbfb3ad23bbd166a4daaea7a5d9e967dce6/stdlib/experimental/iox/iox.flux#L56
And there may be other places.
You could definitely also create a more generic standalone way to use a FlightSQL client from Flux as well in contrib
. In a similar way.
Alright so the current implementation seems to configure an IOx client using the platform's influxdb.GetProvider()
data.
type Config struct {
Org [NameOrID](https://pkg.go.dev/gitlab.qobserve.com/teamq/flux/dependencies/influxdb#NameOrID)
Bucket [NameOrID](https://pkg.go.dev/gitlab.qobserve.com/teamq/flux/dependencies/influxdb#NameOrID)
Host [string](https://pkg.go.dev/builtin#string)
Token [string](https://pkg.go.dev/builtin#string)
}
I will try to understand the full flow and hopefully add optional user-provided override parameters in a new PR and continue this on the community fork!
Thanks @wolffcm for the assist!
Sadly have to reopen this for help @wolffcm
Adding parameters is super easy, so the function can now have an additional host
and token
parameters.
The SqlProcedureSpec
config has been extended with a Host/Token parameters as per Struct:
return &SqlProcedureSpec{
Config: iox.Config{
Bucket: influxdb.NameOrID{
Name: bucket,
},
Host: host,
Token: token,
},
Query: query,
}, nil
The schema seems fine and the config passes.
Next we have to either replace or bypass the getProvider action and this is where it gets tricky and black boxy and when we pass the extended config to create the source, something goes wrong (without clear errors).
I've even tried to create a custom provider for testing, but nothing seems to get me any closer to a configured client.
func (s *SqlProcedureSpec) CreateSource(id execute.DatasetID, a execute.Administration) (execute.Source, error) {
ctx := a.Context()
custom := influxdb.HttpProvider{
DefaultConfig: s.Config,
}
ctx = influxdb.Dependency{
Provider: custom,
}.Inject(ctx)
provider := iox.GetProvider(ctx)
fmt.Println("Provider",provider)
client, err := provider.ClientFor(ctx, s.Config)
if err != nil {
fmt.Println("ClientFor Error",err.Error())
return nil, err
}
return &sqlSource{
d: execute.NewTransportDataset(id, a.Allocator()),
client: client,
query: s.Query,
mem: a.Allocator(),
}, nil
No matter what I'm still always ending up with a supposedly valid config, an empty provider and no client initialization.
ClientFor Error iox client has not been configured
Error: failed to initialize execute state: iox client has not been configured
Any ideas or suggestions?
Hi @lmangani, sorry for the delay in response, I was on vacation last week.
It looks like you are still using the ErrorProvider
which is the default one for vanilla Flux:
https://github.com/influxdata/flux/blob/d8995bbfb3ad23bbd166a4daaea7a5d9e967dce6/dependencies/iox/client.go#L64-L69
I see you doing this:
ctx = influxdb.Dependency{
Provider: custom,
}.Inject(ctx)
And I would expect something like that to be the solution. I am not sure which influxdb.Dependency
is being invoked there, but perhaps it is not the right one.
I think if you were to call this one, you would get the result you want: https://github.com/influxdata/flux/blob/d8995bbfb3ad23bbd166a4daaea7a5d9e967dce6/dependencies/iox/client.go#L26
@wolffcm hope you had a fantastic holiday! As always, thanks for taking the time and for the pointer. I'll give it a shot next.
This issue has had no recent activity and will be closed soon.
I'm playing with iox and Flux, but the experimental
iox
package seems to be using the internal iox client which does not seem to allow connectivity to arbitrary hosts and the method used to determine the client config is a little obscure.Is there a host method we can use by settings Flux parameters, plans to extend it for general usage - or shall we rather contribute a stand-alone flight sql client?
I already have a flightsql contrib in the works and the arrow library
v12
offers a ready driver which works great with iox, but flux still usesv7
. Would a stand-alone contrib using the latestgithub.com/apache/arrow/go/v12/arrow/flight/flightsql/driver
be accepted? @wolffcmThanks in advance,
Lorenzo