bwiley1 / pandleau

A quick and easy way to convert a Pandas DataFrame to a Tableau .hyper or .tde extract.
MIT License
60 stars 19 forks

Hyper API works much faster. #32

Open wzrzt opened 3 years ago

wzrzt commented 3 years ago

I've tried the Hyper API. If we use pandas.DataFrame.iterrows() to insert data into the hyper file row by row, it's not fast. But if we use the Hyper SQL command COPY to create the hyper file directly from a CSV, it's much faster, roughly 10-100x. The only problem is that we first have to write the data to CSV, which is slow with pandas. Luckily, Python has datatable, which is about as fast as R's data.table. I tested it on data with 600M rows and 31 columns, and building the hyper file from the CSV took nearly 17 seconds.
Reference:
https://github.com/tableau/hyper-api-samples/blob/main/Tableau-Supported/Python/create_hyper_file_from_csv.py
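
For anyone who wants to try this, here is a rough sketch of the CSV + COPY approach, loosely following the linked sample. It is not pandleau code: the function names `sql_type_name` and `dataframe_to_hyper`, the `staging.csv` path, the `Extract.Extract` table name, and the simple dtype mapping (integers to big_int, floats to double, everything else to text) are all assumptions made for illustration. It writes the CSV with plain pandas; a faster writer such as datatable could be swapped in at that step.

```python
def sql_type_name(dtype_kind):
    """Map a NumPy dtype kind code to the name of a SqlType factory.

    Hypothetical, deliberately small mapping: anything that is not an
    integer or a float falls back to text.
    """
    if dtype_kind in ("i", "u"):
        return "big_int"
    if dtype_kind == "f":
        return "double"
    return "text"


def dataframe_to_hyper(df, hyper_path, csv_path="staging.csv"):
    """Write df to a CSV, then bulk-load it into a .hyper file with COPY."""
    # Imported here so the helper above works without tableauhyperapi installed.
    from tableauhyperapi import (
        Connection, CreateMode, HyperProcess, SqlType,
        TableDefinition, TableName, Telemetry, escape_string_literal,
    )

    # Step 1: dump the frame to CSV (the slow part with pandas).
    df.to_csv(csv_path, index=False)

    # Step 2: describe the target table from the frame's dtypes.
    columns = [
        TableDefinition.Column(str(name), getattr(SqlType, sql_type_name(dtype.kind))())
        for name, dtype in df.dtypes.items()
    ]
    table = TableDefinition(table_name=TableName("Extract", "Extract"), columns=columns)

    # Step 3: let Hyper read the CSV itself instead of inserting row by row.
    with HyperProcess(telemetry=Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU) as hyper:
        with Connection(endpoint=hyper.endpoint,
                        database=hyper_path,
                        create_mode=CreateMode.CREATE_AND_REPLACE) as connection:
            connection.catalog.create_schema("Extract")
            connection.catalog.create_table(table)
            # execute_command returns the number of rows loaded.
            return connection.execute_command(
                f"COPY {table.table_name} FROM {escape_string_literal(str(csv_path))} "
                f"WITH (format csv, delimiter ',', header)"
            )
```

Usage would be something like `dataframe_to_hyper(df, "output.hyper")`. Dates, booleans and other types would need their own entries in the mapping; this sketch just loads them as text.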
