If you call `get_data` with a file-like object and a CSV file type then, if I've understood the code correctly, the data can almost be read iteratively without loading the entire file into memory. The one obstacle is that `_load_from_stream` does a full `read()` at https://github.com/pyexcel/pyexcel-io/blob/1cffd9d2edbe8decc30968281934fcfd6a3ad774/pyexcel_io/fileformat/_csv.py#L269 in order to look for separators. If that could be made optional (for example, when you know the stream contains no separators), the process would be fully iterative and would only read from the file as you loop over it, which would be useful for extremely large files.
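To illustrate the behaviour I'm after, here is a minimal sketch using only the standard library `csv` module, not pyexcel-io itself. The names `iter_rows` and `counting_lines` are hypothetical helpers made up for this example; the point is only that `csv.reader` pulls lines from the stream on demand, so nothing needs to call `read()` up front when there are no separators to look for.

```python
import csv
import io


def iter_rows(lines, **csv_options):
    """Yield CSV rows one at a time from any iterable of text lines.

    Hypothetical helper: csv.reader consumes its input lazily, so the
    underlying stream is only read as the caller loops over the rows.
    """
    for row in csv.reader(lines, **csv_options):
        yield row


def counting_lines(stream, counter):
    """Pass lines through unchanged while counting how many were consumed.

    Hypothetical helper, used here only to make the laziness visible.
    """
    for line in stream:
        counter["lines"] += 1
        yield line


# A small in-memory stream stands in for a very large file object.
stream = io.StringIO("a,b\n1,2\n3,4\n")
counter = {"lines": 0}

rows = iter_rows(counting_lines(stream, counter))
first_row = next(rows)              # consumes only the first line
lines_after_first = counter["lines"]
remaining_rows = list(rows)         # consumes the rest of the stream
```

After the first `next(rows)` only one line has been taken from the stream, which is the property the full `read()` currently prevents.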