UDST / orca_test

Data assertions for the Orca task orchestrator
BSD 3-Clause "New" or "Revised" License

Improve performance with DataFrameWrapper indexes #17

Closed smmaurer closed 7 years ago

smmaurer commented 7 years ago

This addresses the bug in Issue #16.

Instead of getting DataFrameWrapper indexes using to_frame(), we now use

DataFrameWrapper.index.get_level_values(column_name)

which works for both normal indexes and multi-indexes.
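As a minimal sketch of why this works for both cases, the example below calls get_level_values() on plain pandas objects. It assumes that DataFrameWrapper.index exposes the underlying pandas index; the helper name index_values and the sample tables are hypothetical and not part of orca_test.

```python
import pandas as pd


def index_values(df, name):
    """Return the values of the index level called `name`.

    Hypothetical helper: reads the index directly, so no new
    DataFrame has to be built just to inspect index values.
    """
    return df.index.get_level_values(name)


# Table with a single named index
single = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.Index([1, 2, 3], name="building_id"),
)

# Table with a two-level MultiIndex
multi = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_product(
        [[2010, 2020], ["a", "b"]], names=["year", "zone"]
    ),
)

# The same call handles both index types
single_ids = index_values(single, "building_id").tolist()
multi_years = index_values(multi, "year").tolist()
multi_zones = index_values(multi, "zone").tolist()
```

On a plain index, get_level_values() only accepts the index's own name (or position 0), so a lookup by a name that does not match raises a KeyError instead of silently returning the wrong values.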

@fscottfoti - Could you try this branch with Bay Area UrbanSim and see if it solves the performance issues?

@Eh2406 - If convenient, could you try it with one of your multi-index use cases just to double-check that it doesn't cause any problems?

Eh2406 commented 7 years ago

This is a good improvement. One nit: the whitespace is off. It looks like 1 tab instead of 4 spaces. We should use one or the other.

fscottfoti commented 7 years ago

Unfortunately, this doesn't seem to fix the problem. When I get a chance I'll profile it and see where the time is going.

smmaurer commented 7 years ago

Thanks @Eh2406 and @fscottfoti! I'm going to merge in this PR because it's cleaner than the previous approach, but will leave the issue open until we figure out the real source of the performance problems.