Closed dremio-brock closed 3 years ago
When using the REST API, the dataframe returned is a list of json objects instead of the expected dataframe.
from dremio_client import init import pandas as pd query = 'select * from information_schema."tables"' client = init(simple_client=True) results = client.query(query, asynchronous=False) df = pd.DataFrame(results)
Results
rowCount ... rows 0 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '... 1 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '... 2 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '... 3 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '... 4 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '... 5 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '... 6 636 ... [{'TABLE_CATALOG': 'DREMIO', 'TABLE_SCHEMA': '...
Adding the following code would convert the results from the REST API to a proper dataframe.
new_df = pd.concat([pd.DataFrame(pd.json_normalize(x)) for x in df['rows']], ignore_index=True)
Proper results
TABLE_CATALOG ... TABLE_TYPE 0 DREMIO ... TABLE 1 DREMIO ... TABLE 2 DREMIO ... TABLE 3 DREMIO ... TABLE 4 DREMIO ... VIEW .. ... ... ... 631 DREMIO ... TABLE 632 DREMIO ... TABLE 633 DREMIO ... TABLE 634 DREMIO ... TABLE 635 DREMIO ... TABLE
in query.py one could likely do pd.DataFrame(pd.json_normalize(x) for x in results). I haven't had time to check it yet though so will leave this open till I get around to fixing it.
query.py
pd.DataFrame(pd.json_normalize(x) for x in results)
Description
When using the REST API, the dataframe returned is a list of json objects instead of the expected dataframe.
What I Did
Results
Fix
Adding the following code would convert the results from the REST API to a proper dataframe.
Proper results