kuanb / peartree

peartree: A library for converting transit data into a directed graph for sketch network analysis.
MIT License

[performance] Use .values to extend list of lists (matrices) in lieu of DFs #82

Closed kuanb closed 6 years ago

kuanb commented 6 years ago

I stupidly went with this pattern when originally implementing MP for the route reducer: https://github.com/kuanb/peartree/blob/eb4fdab7a8484a621a0ee78d7723858b5ca2ac27/peartree/summarizer.py#L269-L301

This loop of repeated DataFrame appends is what caused all of the observed performance regression! Instead, just request .values each time and extend a reference list, then create a new pandas DataFrame once the whole matrix is constructed.
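A rough sketch of the pattern described above, not the actual peartree code: the column names and the `make_chunk` / `combine_fast` helpers are hypothetical stand-ins for whatever each worker returns. The point is only that rows are gathered in a plain Python list and the DataFrame is built a single time at the end.

```python
import pandas as pd

# Hypothetical column layout; the real summarizer columns may differ.
COLUMNS = ["from_stop_id", "to_stop_id", "edge_cost"]


def make_chunk(i):
    # Stand-in for the per-route DataFrame that each worker returns.
    return pd.DataFrame({
        "from_stop_id": [f"a{i}", f"b{i}"],
        "to_stop_id": [f"b{i}", f"c{i}"],
        "edge_cost": [float(i), float(i) + 0.5],
    })


def combine_fast(chunks):
    # Collect raw rows in a plain list instead of growing a DataFrame,
    # which would copy all accumulated data on every iteration.
    rows = []
    for chunk in chunks:
        # .values yields a 2D array; tolist() turns it into a list of rows.
        rows.extend(chunk[COLUMNS].values.tolist())
    # Build the DataFrame once, after all rows have been gathered.
    return pd.DataFrame(rows, columns=COLUMNS)


combined = combine_fast(make_chunk(i) for i in range(3))
```

Extending a list is amortized constant time per row, whereas appending to a DataFrame inside the loop reallocates the whole frame each pass, so the cost grows roughly quadratically with the number of chunks.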

Such a stupid mistake. Oh, well.