duneanalytics / dune-client

A framework for interacting with Dune Analytics' officially supported API service
Apache License 2.0
85 stars 22 forks

Support results in CSV, new methods: "refresh_csv()", "refresh_into_dataframe()" and "get_results_csv()" #51

Closed msf closed 1 year ago

msf commented 1 year ago

This implements the use of the DuneAPI with results in CSV format on the conventional (blocking API) client.

This is ideal for loading the data into pandas, which is a common use case.

It is also useful for large results, because it uses less CPU and memory.

Please note that the result is still fully read into memory; until DuneAPI provides pagination support, this is unavoidable.
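One plausible reason CSV is lighter than JSON for tabular results is that column names appear once in the header instead of being repeated in every row. The sketch below illustrates this with made-up rows; the column names and values are assumptions for illustration, not real Dune query output, and it does not call the Dune API.

```python
import csv
import io
import json

# Hypothetical sample rows (not real Dune output), used only to compare encodings.
rows = [
    {"block_number": 17_000_000 + i, "tx_count": 100 + i, "gas_used": 12_345_678 + i}
    for i in range(1000)
]


def to_json_payload(rows):
    """Serialize rows as a JSON document; every row repeats the column names."""
    return json.dumps({"rows": rows})


def to_csv_payload(rows):
    """Serialize rows as CSV; the column names appear once, in the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


json_payload = to_json_payload(rows)
csv_payload = to_csv_payload(rows)

# Both payloads are still held fully in memory, as noted above.
print(f"JSON: {len(json_payload)} chars, CSV: {len(csv_payload)} chars")
```

The actual savings depend on the number of columns, the length of the column names, and the value types, so treat this as an illustration, not a benchmark.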

This PR includes a suggestion of an additional method result_into_dataframe() that would load the result directly into a DataFrame in a more efficient way. It isn't implemented, because it would add a dependency on pandas.
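One way to offer DataFrame output without making pandas a required dependency is to import it lazily inside the method. The helper below is a minimal sketch of that pattern; `csv_to_dataframe` is a hypothetical name and is not part of dune-client.

```python
import io


def csv_to_dataframe(csv_text: str):
    """Hypothetical helper: parse CSV text into a pandas DataFrame.

    pandas is imported only when this function is called, so code that
    never asks for a DataFrame does not need pandas installed.
    """
    try:
        import pandas as pd
    except ImportError as exc:
        raise ImportError(
            "pandas is required for DataFrame output; "
            "install it with `pip install pandas`"
        ) from exc
    return pd.read_csv(io.StringIO(csv_text))
```

Usage, assuming pandas is installed:

```python
df = csv_to_dataframe("block_number,tx_count\n1,10\n2,20\n")
print(df.dtypes)
```

With this approach the cost of the dependency falls only on users who want DataFrames, and everyone else gets a clear error message instead of an import failure at package load time.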

gentrexha commented 1 year ago

Looks great to me. Please feel free to add the try-catch with pandas import. There have been other people suggesting that we allow pandas as a direct dependency, although I have been hesitant for reasons of undesired bulk (cc @gentrexha). Maybe it's time we just let it happen...

I think it's a good idea. If you're working with any type of data in Python you're most likely using pandas anyhow.

msf commented 1 year ago

@bh2smith, @gentrexha I've added pandas to the dev.txt requirements so that it can be used in the e2e tests.

I'm a fan of the current style of not adding a required dependency, especially because pandas is quite a tricky one that depends on compiled C++ libraries and so forth.

Tests now pass, except the e2e because it needs to run with the team's API key :-)

Merging this to see what happens to the end2end tests; they should pass!