Closed vikashsingh009 closed 3 years ago
sorry for the delay @vikashsingh009
The schema information exists in the flight info found here: https://github.com/rymurr/dremio_client/blob/master/dremio_client/flight/__init__.py#L83
Probably the easiest way to get the schema would to call the flight pyarrow functions directly rather than use the dremio client helper function. Unless you want to add a function to fetch the schema, happy to accept PRs :-)
@rymurr ,i have followed above steps to get flight info schema ,i able to get the schema and trying to convert it to json
and send back to rest api response for that i have written following rest api
from flask_cors import CORS
from dremio_client.dremio_client import query
from pyarrow import flight
import json
import pandas as pd
import pyarrow as pa
# Load an empty map
from keplergl import KeplerGl
from flask import Flask, request, jsonify
app = Flask(__name__)
app.config["DEBUG"] = True
CORS(app)
class HttpDremioClientAuthHandler(flight.ClientAuthHandler):
@app.route('/api/', methods=['POST', 'GET'])
def api_post():
if request.method == 'POST':
print('post app')
username = ''
password = ''
hostname = ''
port = '47470'
sql = '''select * from Test_Space.Datasets.FacilityCode'''
tlsfilename = 'ba04da291ec01ee6.crt'
#output = query(sql =sql, hostname=hostname, port=port, username=username, password=password,tls_root_certs_filename=tlsfilename)
output = query(token= None,sql =sql, hostname=hostname, flight_port=port, username=username, password=password,odbc_port=None,ssl_verify=False)
pandaataframe=pd.DataFrame(output)
x = json.dumps(json.loads(pandaataframe.to_json(orient='records')), indent=2)
client = flight.FlightClient('grpc+tcp://localhost:47470')
client.authenticate(HttpDremioClientAuthHandler(username, password))
info = client.get_flight_info(flight.FlightDescriptor.for_command(sql))
reader = client.do_get(info.endpoints[0].ticket)
batches = []
while True:
try:
batch, metadata = reader.read_chunk()
batches.append(batch)
except StopIteration:
break
data = pa.Table.from_batches(batches)
#panda data frame
pdf = data.to_pandas()
schema = pa.Schema.from_pandas(pdf)
print(schema)
schemadata= json.dumps(info.schema)
print(info.schema)
jsondata ={"rows":x,"schema":schemadata}
return jsonify(jsondata)
else:
return 'not OK'
def __init__(self, username, password):
super(flight.ClientAuthHandler, self).__init__()
self.basic_auth = flight.BasicAuth(username, password)
#self.tls_certificates = 'ba04da291ec01ee6.crt'
self.token = None
def authenticate(self, outgoing, incoming):
auth = self.basic_auth.serialize()
outgoing.write(auth)
self.token = incoming.read()
return self.token
def get_token(self):
return self.token
app.run(host='0.0.0.0', port=5000)
But ,when i am sending request on API i am geeting below error
TypeError: Object of type Schema is not JSON serializable
Hey @vikashsingh009
The schema object cant be converted to json and isn't really suitable as a json representation: https://lists.apache.org/thread.html/029b486810af851e7112488ae95e9add59919be826b5ad794b993614%40%3Cdev.arrow.apache.org%3E
So I would try and figure out a different format to pass the schema or hand convert it using https://arrow.apache.org/docs/python/generated/pyarrow.Schema.html
thanks @rymurr i am eagerly waiting for your response
Description
I am trying to ser dremio flight client to query flight data,but data what i recieved from the dremio flight is missing schema ,i need schema to bind on the map ,i able to get same scheam with Data using dremio simple python client
What I Did
i have written below code for python for flight
on the above code i am just getting data which i converted to json array,but i need to get data as well as Schema so that i can use in maps
here is the desired output i want to achive