This PR removes the arrays of pandas DataFrames that were built up after processing each chunk. They are replaced with a single DataFrame to which the results of each chunk are appended. The change applies to both `train` and `classify`.
The reason is that many small DataFrame instances carry a lot of per-object memory overhead. Keeping only one DataFrame drastically reduces memory usage.
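
For reviewers who want the idea at a glance, here is a minimal sketch of the pattern, not the code in this PR. `process_chunk` and `collect_results` are hypothetical names, and the chunk contents are made up for illustration; the real processing in `train` and `classify` is assumed to return one DataFrame per chunk.

```python
import pandas as pd


def process_chunk(chunk):
    # Hypothetical stand-in for the real per-chunk processing step.
    return pd.DataFrame({"value": chunk, "squared": [v * v for v in chunk]})


def collect_results(chunks):
    # Keep one growing DataFrame instead of a list of many small ones.
    results = None
    for chunk in chunks:
        part = process_chunk(chunk)
        if results is None:
            results = part
        else:
            # ignore_index=True renumbers the rows of the combined frame.
            results = pd.concat([results, part], ignore_index=True)
    # Return an empty frame if there were no chunks at all.
    return results if results is not None else pd.DataFrame()


combined = collect_results([[1, 2], [3, 4], [5]])
```

One trade-off to keep in mind: `pd.concat` copies the accumulated rows on each call, so this pattern exchanges some extra copying for fewer live objects.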
closes #274