When constructing a pandas.DataFrame from only a subset of Avro fields (columns), it is more efficient to pass the columns kwarg to the DataFrame.from_records constructor than to load every field into memory and filter afterwards. One might similarly want to specify fields (columns) to exclude, but this is a secondary concern: an exclude list is unlikely to be large enough to yield significant efficiency gains.
This PR exposes the kwargs of the DataFrame.from_records constructor in the read_avro/from_avro functions for this purpose.
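As a rough illustration of the idea, the sketch below forwards arbitrary kwargs to DataFrame.from_records. The helper name records_to_frame and the sample records are hypothetical and stand in for decoded Avro records; this is not the actual read_avro/from_avro implementation, only the kwarg pass-through pattern under those assumptions.

```python
import pandas as pd


def records_to_frame(records, **kwargs):
    """Hypothetical helper: forward all kwargs to DataFrame.from_records."""
    return pd.DataFrame.from_records(records, **kwargs)


# Stand-in for decoded Avro records (assumed shape: one dict per record).
records = [
    {"id": 1, "name": "a", "payload": "x" * 10},
    {"id": 2, "name": "b", "payload": "y" * 10},
]

# Keep only the fields of interest.
subset = records_to_frame(records, columns=["id", "name"])

# Alternatively, drop the fields that are not needed.
without_payload = records_to_frame(records, exclude=["payload"])
```

Both columns and exclude are existing parameters of DataFrame.from_records, so forwarding the kwargs needs no extra selection logic in the wrapper.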