Closed barnettjacob closed 8 years ago
Hi, apologies if this isn't the right place for this!
I'm trying to create a Castra with:
import dask.dataframe as dd
df = dd.read_csv('/home/jacob/av_files/*.csv',
                 names=['property_id', 'date', 'available', 'minimum_stay', 'price',
                        'destination', 'primary_geo_unit', 'capacity', 'tp_rev_ct',
                        'snapshot_date'])
df.set_index('snapshot_date', compute=False).to_castra('av_test.castra', categories=T)
But I am seeing the error below. My data has plenty of non-English characters.
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8e in position 424: invalid start byte
Thanks, Jacob
Looks like a dask issue, actually.
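For anyone hitting the same message, here is a minimal sketch of what the error means, using plain Python only. The sample bytes are made up for illustration and are not taken from the CSV files above. The byte 0x8e cannot begin a UTF-8 sequence, so a strict UTF-8 decode fails, whereas a single-byte codec such as latin-1 accepts any byte. The real encoding of the files is unknown here; latin-1 is only a stand-in. pandas.read_csv does accept an `encoding` argument, so passing one through dd.read_csv may be worth trying, though whether that resolves this particular case is an assumption.

```python
# Hypothetical sample: ASCII text followed by the byte from the traceback.
raw = b"caf\x8e"

# Strict UTF-8 decoding fails because 0x8e is a continuation byte and
# cannot start a character.
try:
    raw.decode("utf-8")
    error_message = None
    error_position = None
except UnicodeDecodeError as exc:
    error_message = str(exc)
    error_position = exc.start  # offset of the offending byte

# latin-1 maps every byte to a code point, so this never raises. It only
# yields the intended text if the data really is latin-1 (an assumption).
decoded = raw.decode("latin-1")

print(error_message)
print(error_position)
print(repr(decoded))
```

If the files turn out to use some other single-byte encoding, the same pattern applies with that codec name in place of latin-1.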