h2oai / datatable

A Python package for manipulating 2-dimensional tabular data structures
https://datatable.readthedocs.io
Mozilla Public License 2.0

[bug] Error reading multiple CSV files #3472

Open showkeyjar opened 1 year ago

showkeyjar commented 1 year ago
from datatable import iread, rbind

# Columns to keep from every file
sel_cols = ['c1', 'c2', 'c3', 'c4']
# df_site is defined elsewhere (not shown); one CSV file per station_id
files = ["/data/" + str(f) + ".csv" for f in df_site['station_id'].tolist()]
dt_data = iread(files, columns=sel_cols, errors='ignore')
# error 1: if the 'columns' parameter is passed to iread, the output is empty
dt_all = rbind(dt_data, force=True)
df_data = dt_all.to_pandas()
# error 2: if 'columns' is omitted, it raises: Cannot rbind column of type str32 to a column of type int32

I need the result as a pandas DataFrame. Alternatively, could you advise how to convert int32 to str32?

Environment: CentOS 7, Python 3.8, datatable 1.0

samukweku commented 1 year ago

Kindly provide a minimal reproducible example, along with small sample files you can share.
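
A reproduction along the following lines might serve as a starting point. It is only a sketch: the file names, the sample values, and the helpers `write_sample_csvs` and `reproduce` are hypothetical, and it assumes the failure comes from column `c1` being numeric in one file and text in another, which the report does not confirm. It has not been verified to trigger either error.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csvs(directory):
    """Write two tiny CSV files whose column c1 differs in type (made-up data)."""
    samples = {
        "station_a.csv": [["c1", "c2", "c3", "c4"], [1, 2, 3, 4], [5, 6, 7, 8]],
        "station_b.csv": [["c1", "c2", "c3", "c4"], ["x", 2, 3, 4], ["y", 6, 7, 8]],
    }
    paths = []
    for name, rows in samples.items():
        path = Path(directory) / name
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        paths.append(str(path))
    return paths


def reproduce(paths, columns=None):
    """Run the iread + rbind sequence from the report on the given files."""
    # Imported lazily so the sample files can be generated without datatable.
    from datatable import iread, rbind

    frames = iread(paths, columns=columns, errors="ignore")
    return rbind(frames, force=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = write_sample_csvs(tmp)
        # Try both variants from the report: without and with 'columns'.
        for cols in (None, ["c1", "c2", "c3", "c4"]):
            try:
                print("columns =", cols)
                print(reproduce(files, columns=cols))
            except Exception as exc:  # show the error instead of stopping
                print("failed:", type(exc).__name__, exc)
```

Running the script prints the combined frame, or the exception, for each variant, so the output can be pasted into the issue together with the two generated files.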