wiseio / paratext

A library for reading text files over multiple cores.
Apache License 2.0
1.06k stars 103 forks

add support for opening multiple files #49

Open andytwigg opened 7 years ago

andytwigg commented 7 years ago

It would be nice to be able to pass a list of files to load and get back a single dataframe with the rows concatenated, consistent with using ignore_index=True. This would avoid relying on df.append, which creates a copy.

deads commented 7 years ago

Hi, thank you for your feature request, and sorry for the delay. That is a great idea!

It can be achieved in pure Python by creating multiple ColBasedLoaders, one per file:

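def internal_create_multiple_csv_loaders(filenames, *args, **kwargs):
    # Create one loader per file, passing the same options to each.
    # internal_create_csv_loader is the existing single-file helper.
    return [internal_create_csv_loader(filename, *args, **kwargs) for filename in filenames]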

Then, concatenate the column chunks in a second yield statement. We will consider this feature for inclusion in the roadmap. Thank you for suggesting it.
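
As a rough sketch of the concatenation step, here is a standard-library stand-in rather than paratext's actual loader code. The names iter_column_chunks and iter_merged_columns are hypothetical and not part of paratext. It assumes every file has the same header row, and it keeps all values as strings:

```python
import csv
from collections import OrderedDict


def iter_column_chunks(filename):
    """Yield (column_name, values) pairs for one CSV file.

    Hypothetical stand-in for a per-file column-based loader.
    """
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = [[] for _ in header]
        for row in reader:
            for values, cell in zip(columns, row):
                values.append(cell)
    for name, values in zip(header, columns):
        yield name, values


def iter_merged_columns(filenames):
    """Yield (column_name, values) with rows from all files concatenated in order."""
    merged = OrderedDict()
    for filename in filenames:
        # First pass: collect each file's column chunks under the column name.
        for name, values in iter_column_chunks(filename):
            merged.setdefault(name, []).extend(values)
    # Second pass: emit each column once, now spanning every file.
    for name, values in merged.items():
        yield name, values
```

Building a dataframe from dict(iter_merged_columns(filenames)) should give a fresh 0..n-1 index, much like ignore_index=True, without any intermediate df.append calls.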