ElectionDataAnalysis / electiondata

Tools for consolidation and analysis of raw election results from the most reliable sources -- the election agencies themselves.

Improve speed of joining all dataframes by doing so all at once #724

Open toddgraham121 opened 3 years ago

toddgraham121 commented 3 years ago

When testing VT's data (> 100 files), pytest reports the following warning:

tests/dataloading_tests/test_dataloading.py::test_loading
  /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pandas/core/frame.py:4481: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead.  To get a de-fragmented frame, use `newframe = frame.copy()`
    data[k] = com.apply_if_callable(v, data)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
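For reference, here is a minimal sketch of the pattern the warning recommends. It is not taken from the electiondata code: the function names `join_one_by_one` and `join_all_at_once`, the `col_*` column names, and the toy data are all hypothetical, and the sketch assumes each loaded file ends up as a DataFrame sharing a common index. Inserting columns one at a time fragments the frame; a single `pd.concat(axis=1)` call builds the frame in one step.

```python
import pandas as pd


def join_one_by_one(pieces):
    """Fragmenting pattern: add each column with a separate assignment."""
    result = pd.DataFrame(index=pieces[0].index)
    for piece in pieces:
        for name in piece.columns:
            # Each assignment adds another internal block to the frame.
            result[name] = piece[name]
    return result


def join_all_at_once(pieces):
    """Recommended pattern: join every piece with a single concat call."""
    return pd.concat(pieces, axis=1)


# Hypothetical stand-in for > 100 loaded files: 150 one-column frames
# that share the same 5-row index.
index = pd.RangeIndex(5)
pieces = [
    pd.DataFrame({f"col_{i}": [i] * len(index)}, index=index)
    for i in range(150)
]

combined = join_all_at_once(pieces)
print(combined.shape)
```

If the column-by-column loop cannot be removed right away, calling `.copy()` on the finished frame, as the warning itself suggests, returns a de-fragmented frame.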