massive speed up for ipu

UDST / synthpop

Synthetic populations from census data

BSD 3-Clause "New" or "Revised" License

100 stars 46 forks source link

massive speed up for ipu #31

Closed Eh2406 closed 7 years ago

Eh2406 commented 7 years ago

on our data the old code synthesized a bg in 70-80 sec with this change it does it in 2-3 sec, more than 20 times faster.

cc @semcogli

janowicz commented 7 years ago

Very cool! Looking forward to checking this out. Like you mentioned in the other pull request, the failing Travis runs are due to the separate issue of deprecated index.diff

Eh2406 commented 7 years ago

It sounds like we need a version of UDST/urbansim#179. Then I can rebase on top of that. How would you like me to proceed?

janowicz commented 7 years ago

Yes, good call. Thanks. We'll get a version of urbansim#179 in today, and then notify you so you can rebase.

janowicz commented 7 years ago

@Eh2406 The deprecated index.diff has been updated now, and the change is in master: https://github.com/UDST/synthpop/pull/32

Can you rebase the pull requests when you get the chance?

Eh2406 commented 7 years ago

I have a meeting, I will rebase when it is done. :-) Thanks.

Eh2406 commented 7 years ago

@semcogli is also working on making sure it is fast on other work loads, not just ours.

coveralls commented 7 years ago

Coverage increased (+0.04%) to 81.296% when pulling f66fd78f145f54a1dd5a63f58c9da71ebe3d5d88 on SEMCOG:speed_up_drop_deros into 6d2f3dc1b25a85404760a001c8a927ddf00199e1 on UDST:master.

semcogli commented 7 years ago

A little more info about this update. If the table size is not big, say less than 2000 columns, the original code actually works faster. However, the frequency table we are dealing here has 32257 columns. In this case, the new code will use much shorter time.

janowicz commented 7 years ago

Looks good to me. I tried synthesis test runs with a couple different recipes, before and after the change above- and observed a speed-up when using this change in both cases, with the greater speedup coming in the recipe that results in bigger frequency table (in line with the comment above). I think this trade-off is ok, as I think most operational workloads will tend towards synthesizing with more constraints.