influxdata / docs-v2

InfluxData Documentation that covers InfluxDB Cloud, InfluxDB OSS 2.x, InfluxDB OSS 1.x, InfluxDB Enterprise, Telegraf, Chronograf, Kapacitor, and Flux.
https://docs.influxdata.com
MIT License

Add query example for pyarrow.flight FlightClient #4926

Open jstirnaman opened 1 year ago

jstirnaman commented 1 year ago

Provide a v3 query example using pyarrow.flight FlightClient. Add pyarrow.flight to Flight SQL clients reference.

The sample code below doesn't work yet. Token authentication apparently still requires middleware, or inserting the auth header in some way that isn't clear from the docs or the code.

from pyarrow.flight import (
  Action,
  ClientMiddleware,
  FlightCallOptions,
  FlightClient,
  Ticket,
)
import json
import os
import pandas as pd

# Connection settings are read from environment variables.
DATABASE_NAME = os.getenv('CLOUD_SERVERLESS_BUCKET_NAME')
DATABASE_TOKEN = os.getenv('CLOUD_SERVERLESS_READ_WRITE_TOKEN')
INFLUXDB_HOST = os.getenv('CLOUD_SERVERLESS_HOST')

# influxql = "SELECT MEAN(elapsed) FROM home WHERE time >= '2023-04-07 19:35:19.314571143' AND time <= '2023-04-07 21:51:29.638717686' GROUP BY time(10m)"

client = FlightClient(f"grpc+tls://{INFLUXDB_HOST}:443")

# Headers are passed as (bytes, bytes) pairs; str.encode() already returns bytes.
token_bytes = f"Bearer {DATABASE_TOKEN}".encode('utf-8')
token_pair = (b'authorization', token_bytes)
database = (b'database', DATABASE_NAME.encode('utf-8'))
options = FlightCallOptions(headers=[token_pair, database])

# do_action = client.do_action(action, options)

# client.authenticate(options)

def test_sql_query():
  # Describe the query in a JSON ticket.
  sql = "SELECT * FROM home WHERE room LIKE 'telegraf%'"

  ticket_data = {
    "sql_query": sql,
    "query_type": "sql"
  }

  # Encode the JSON string so that ticket_bytes actually holds bytes.
  ticket_bytes = json.dumps(ticket_data).encode('utf-8')
  ticket = Ticket(ticket_bytes)
  reader = client.do_get(ticket=ticket, options=options)
  arrow_table = reader.read_all()
  # data_frame = arrow_table.to_pandas()
  # print(data_frame.to_markdown())

test_sql_query()
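
For comparison, here is a minimal sketch of the same per-call header approach with the header and ticket construction pulled into small helpers. The helper names (`build_call_headers`, `build_ticket_bytes`, `run_sql`) are hypothetical, and the `database` key in the ticket is an assumption that the sample above does not include. This has not been verified against InfluxDB Cloud Serverless.

```python
import json


def build_call_headers(token, database):
    """Return gRPC metadata as (bytes, bytes) pairs for FlightCallOptions."""
    return [
        (b"authorization", f"Bearer {token}".encode("utf-8")),
        (b"database", database.encode("utf-8")),
    ]


def build_ticket_bytes(sql, database):
    """Serialize the query as a JSON ticket.

    The "database" key is an assumption; the sample above omits it.
    """
    ticket_data = {"database": database, "sql_query": sql, "query_type": "sql"}
    return json.dumps(ticket_data).encode("utf-8")


def run_sql(host, token, database, sql):
    """Send the query with per-call headers and return a pyarrow Table."""
    # Imported here so the helpers above can be used without pyarrow installed.
    from pyarrow.flight import FlightCallOptions, FlightClient, Ticket

    client = FlightClient(f"grpc+tls://{host}:443")
    options = FlightCallOptions(headers=build_call_headers(token, database))
    reader = client.do_get(Ticket(build_ticket_bytes(sql, database)), options)
    return reader.read_all()
```

Calling `run_sql(INFLUXDB_HOST, DATABASE_TOKEN, DATABASE_NAME, sql)` would stand in for `test_sql_query()`; whether the server accepts the token this way is exactly the open question in this issue.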
jstirnaman commented 1 year ago

Started work on branch feat-pyarrow-flight, but token auth isn't straightforward. Once it is, pyarrow.flight FlightClient should replace flightsql-dbapi; a comment in flightsql-dbapi says the same. Until then, I'm not sure there's any reason to mention pyarrow.flight FlightClient.
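
For reference, the middleware route mentioned earlier might look roughly like the sketch below. It uses pyarrow's `ClientMiddlewareFactory` and `ClientMiddleware` hooks to attach the headers to every call instead of passing `FlightCallOptions` each time. The names `bearer_headers`, `make_client`, `BearerMiddleware`, and `BearerMiddlewareFactory` are hypothetical, and this is untested against InfluxDB, so treat it as a starting point rather than a confirmed fix.

```python
def bearer_headers(token, database):
    """Headers to attach to every Flight call (gRPC requires lowercase keys)."""
    return {"authorization": f"Bearer {token}", "database": database}


def make_client(host, token, database):
    """Create a FlightClient whose middleware adds the auth headers."""
    # Imported here so bearer_headers() can be used without pyarrow installed.
    from pyarrow.flight import (
        ClientMiddleware,
        ClientMiddlewareFactory,
        FlightClient,
    )

    class BearerMiddleware(ClientMiddleware):
        def sending_headers(self):
            # Called before each request; the returned dict is sent as headers.
            return bearer_headers(token, database)

    class BearerMiddlewareFactory(ClientMiddlewareFactory):
        def start_call(self, info):
            # One middleware instance is created per call.
            return BearerMiddleware()

    return FlightClient(
        f"grpc+tls://{host}:443",
        middleware=[BearerMiddlewareFactory()],
    )
```

With a client built this way, `client.do_get(ticket)` would carry the token without a separate `options` argument, assuming the server reads the token from the `authorization` header.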