duneanalytics / dune-client

A framework for interacting with Dune Analytics' officially supported API service
Apache License 2.0
90 stars, 21 forks

fix: results containing `\u0000` break `run_query_dataframe` #133

Open gosuto-inzasheru opened 4 months ago

gosuto-inzasheru commented 4 months ago

pandas' `read_csv` does not properly read a string that contains control characters such as `\u0000`; the string is discarded instead

either these characters should be stripped on Dune's side, or maybe there is a cleaner way than my current hacky workaround:

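import pandas as pd

# `dune` (the client) and `query` are assumed to be set up already
raw = dune.run_query_csv(query).data
with open('dirty.tmp', 'wb') as f:
    f.write(raw.getbuffer())
# read the dump back as text (closing the file afterwards) and strip the offending characters
with open('dirty.tmp') as f:
    buffer = f.read()
for sanitise in ['\x00', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f']:
    buffer = buffer.replace(sanitise, '')
with open('clean.tmp', 'w') as f:
    f.write(buffer)
df = pd.read_csv('clean.tmp')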

for clarity, the data point in question is the result of:

FROM_UTF8(VARBINARY_LTRIM (data))
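
a possibly cleaner variant of the workaround above, as a minimal sketch rather than anything dune-client provides: the same characters are stripped in memory, so no temp files are needed. `sanitise_csv` and `CONTROL_CHARS` are hypothetical names, and the sketch assumes the CSV is UTF-8 and that `.data` is a `BytesIO`-like buffer (as the `getbuffer()` call above suggests).

```python
import io

# Same control characters the workaround above strips (assumed to be the full set needed).
CONTROL_CHARS = "\x00\x1a\x1b\x1c\x1d\x1e\x1f"
# Translation table that deletes every character listed in CONTROL_CHARS.
_STRIP_TABLE = str.maketrans("", "", CONTROL_CHARS)


def sanitise_csv(raw: io.BytesIO, encoding: str = "utf-8") -> io.StringIO:
    """Decode the CSV bytes and drop the control characters, all in memory."""
    text = raw.getvalue().decode(encoding)
    return io.StringIO(text.translate(_STRIP_TABLE))


# Small self-contained check: a NUL byte embedded in the middle of a field.
sample = io.BytesIO(b"id,name\n1,foo\x00bar\n2,baz\n")
cleaned = sanitise_csv(sample).getvalue()
```

with the client from the workaround this would presumably be used as `df = pd.read_csv(sanitise_csv(dune.run_query_csv(query).data))`. Note that it silently drops the characters, just like the original loop, so the resulting strings differ from what the query returned.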