Open zincopper opened 1 year ago
Here is how I get the data in:
import vaex
from pyarrow import csv

def skip_error(row):
    # Log the invalid row, then tell the Arrow CSV parser to skip it.
    print('skip_error row:', row)
    return 'skip'

read_options = csv.ReadOptions(column_names=['room_id', 'uid', 'gift_id', 'yuchi_amt', 'dateline'])
parse_options = csv.ParseOptions(invalid_row_handler=skip_error)
# Dictionary-encode string columns automatically, with a very high cardinality limit.
convert_options = csv.ConvertOptions(include_missing_columns=True,
                                     auto_dict_encode=True, auto_dict_max_cardinality=800_000_000)
data = vaex.from_csv_arrow(file_path,
                           read_options=read_options, parse_options=parse_options, convert_options=convert_options)
category_labels also gives wrong answers. These functions only consider the first chunk of the ChunkedArray.