If you load 20 years of data into one dataframe, you could start pushing up against memory limits on certain users' computers. You can reduce memory usage about 30% by converting float64 values to float32 when loading the yearly play-by-play data.
On my computer, this reduced the memory usage of a single year from 129.7 MB to 94.5 MB. I don't think the lost precision is going to matter for anything we are doing with football stats.
If you are interested, I can submit a pull request that implements this change. You could also make it optional with the default being to downcast but allow the user to override if they want np.float64 as the dtype.
If you load 20 years of data into one dataframe, you could start pushing up against memory limits on certain users' computers. You can reduce memory usage about 30% by converting float64 values to float32 when loading the yearly play-by-play data.
On my computer, this reduced the memory usage of a single year from 129.7 MB to 94.5 MB. I don't think the lost precision is going to matter for anything we are doing with football stats.
If you are interested, I can submit a pull request that implements this change. You could also make it optional with the default being to downcast but allow the user to override if they want np.float64 as the dtype.