Azure / azure-kusto-python

Kusto client libraries for Python
MIT License

API CHANGE: add `to_dataframe` to each table #126

Open danield137 opened 5 years ago

danield137 commented 5 years ago

After investing work in https://github.com/Azure/azure-kusto-python/pull/124 and some internal discussions, we agreed to put that PR on hold and reconsider changing the API instead. The goal is better performance for both vanilla Python and pandas use cases, and to avoid some of the difficult trickery needed to parse Kusto types into a dataframe.

The final API would look like:

# result is of type KustoResultDataSet
result = client.execute(db, query)
# raw json 
result.tables[0].json()
# iterator with lazy parsing of json
result.tables[0].rows()
# dataframe parsing from raw json
result.tables[0].to_dataframe()
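
To make the proposal concrete, here is a minimal sketch of how a single table could expose these three accessors. `SketchTable` is a hypothetical stand-in, not the library's class, and the `Columns` / `ColumnName` / `Rows` layout of the raw payload is an assumption made for illustration. pandas is imported only inside `to_dataframe()`, so vanilla Python callers would not need it installed.

```python
from typing import Any, Dict, Iterator


class SketchTable:
    """Hypothetical stand-in for one table of a KustoResultDataSet."""

    def __init__(self, raw: Dict[str, Any]) -> None:
        # Keep only the raw JSON payload; nothing is parsed up front.
        self._raw = raw

    def json(self) -> Dict[str, Any]:
        # Raw JSON, returned as-is.
        return self._raw

    def rows(self) -> Iterator[Dict[str, Any]]:
        # Lazy parsing: each row is built only when the caller asks for it.
        names = [col["ColumnName"] for col in self._raw["Columns"]]
        for values in self._raw["Rows"]:
            yield dict(zip(names, values))

    def to_dataframe(self):
        # Assumption: pandas is an optional dependency, imported on demand.
        import pandas as pd

        names = [col["ColumnName"] for col in self._raw["Columns"]]
        return pd.DataFrame(self._raw["Rows"], columns=names)


# Assumed payload shape, for illustration only.
raw_table = {
    "TableName": "PrimaryResult",
    "Columns": [
        {"ColumnName": "name", "ColumnType": "string"},
        {"ColumnName": "value", "ColumnType": "long"},
    ],
    "Rows": [["a", 1], ["b", 2]],
}

table = SketchTable(raw_table)
first = next(table.rows())  # only the first row has been parsed so far
```

Because `rows()` is a generator, a caller that stops early never pays for parsing the remaining rows, while `to_dataframe()` builds the frame straight from the raw rows without going through per-row objects.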

Holding the raw JSON alongside the parsed result will cause some memory pressure, so a best practice would probably be:

# either explicitly access a specific table and drop the reference after conversion
df = client.execute(db, query).primary_results[0].to_dataframe()
# or, parse it all
dfs = client.execute(db, query).to_dataframes()
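
One way a "parse it all" helper could limit that pressure is to release each raw payload as soon as it has been converted. The sketch below is an illustration only: `convert_and_release` and its `convert` parameter are hypothetical names, and the converter is passed in as a callable so the example runs without pandas.

```python
from typing import Any, Callable, List


def convert_and_release(raw_tables: List[Any], convert: Callable[[Any], Any]) -> List[Any]:
    """Convert every raw table, removing each one from the list as it is consumed."""
    converted = []
    raw_tables.reverse()  # so that pop() from the end preserves the original order
    while raw_tables:
        raw = raw_tables.pop()  # the list no longer references this payload
        converted.append(convert(raw))
    return converted


# Stand-in payloads and converter (hypothetical): count the rows of each table.
payloads = [{"Rows": [[1], [2]]}, {"Rows": [[3]]}]
row_counts = convert_and_release(payloads, lambda raw: len(raw["Rows"]))
```

After the call, `payloads` is empty, so the raw data can be garbage collected as long as nothing else references it.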

Feel free to add your thoughts; the code will be implemented in the next couple of weeks.

danield137 commented 5 years ago

https://github.com/Azure/azure-kusto-python/pull/127

vladikbr commented 3 years ago

Postponed