I mentioned this on Slack, but wanted to write it up here as well.
Something in the last few months has caused HLCM estimation to slow down tremendously. When I re-execute the Estimation.ipynb notebook, the "hlcm_estimate" model that used to complete in 1 minute now takes several hours to converge.
I haven't been able to isolate what's causing this, but Fletcher confirms it's not working for him either.
Some questions that might help:
I honestly have no good ideas here because in theory there have been no changes to the way LCM estimation works. I really need to know which step(s) of the process are taking all the time, preferably with a profile dump.
I'd also want to know if it's actually sampling alternatives correctly, and the memory use could be a clue to that. It might be worth a test with a really simple model, with only a few alternatives, with only a few "choosers."
I wonder if the problem is the estimation sample size of 15000. The estimation_sample_size parameter is used to control how many choosers are used during estimation. So estimation is trying to be done with 15000 choosers x 100 alts per chooser. That might slow things down. But I think that's how it has always been done?
Yeah I noticed that and tried it with more like 1500 and it was still slow.
Fletcher, did you notice this also? Do you or Sam have any idea what change might have triggered it or how long ago this started?
Paul
Usually something more like 100 would be used. Results don't change very much with larger sample sizes.
I haven't estimated the HLCM in a while, but I tried it out and it never finished estimating, so there seems to be a problem. Matt changed a few things in the HLCM a few months ago, which is the only thing I can think of.
This 15,000 or 1,500 is the sample size of choosers, not the sample size of alternatives. Right now the choosers are the actual households in the synthesized population, so they need to be sampled.
Thanks for looking into this. The codebase is the current synthicity bayarea_urbansim master. The timeframe would be since the last time the "hlcm_estimate" model was run in the copy of Estimation.ipynb that's in the repo. That actually looks like October 2014 -- the notebook file was changed in March 2015, but that cell wasn't updated.
I ran a test in sanfran_urbansim so I could profile the hlcm estimation. It has slowed down too, though it completes in 4-5 minutes with 3000 choosers x 4 segments. There the slowness is in the sampling of alternatives in mnl_interaction_dataset (see this changeset: https://github.com/synthicity/urbansim/commit/d17a5ced1aca84e9e0fce0249921255804d51a52). np.random.choice is slow and calling it many times adds up. I don't know if that's the exact same thing causing bayarea_urbansim to be so slow, but it's definitely contributing. (It'll be called (estimation_sample_size * number of segments) times, at 0.02218 seconds per call on my laptop.)
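For a rough sense of where that time goes, here is a toy timing sketch, not the actual interaction.py code; the array sizes and chooser counts are made up for illustration. It compares the per-chooser np.random.choice pattern (sampling without replacement) against a single vectorized draw with replacement.

import numpy as np
import timeit

n_alts = 1800000      # illustrative size of the universal choice set
sample_size = 100     # alternatives drawn per chooser
n_choosers = 3000     # illustrative number of choosers
alts = np.arange(n_alts)

# per-chooser pattern: one np.random.choice call per chooser, without
# replacement -- each call effectively shuffles the whole alternatives array
per_call = timeit.timeit(
    lambda: np.random.choice(alts, sample_size, replace=False), number=20) / 20
print("per-chooser call: %.4f s, ~%.0f s for all choosers" % (per_call, per_call * n_choosers))

# single vectorized draw with replacement for every chooser at once
one_call = timeit.timeit(
    lambda: np.random.choice(alts, sample_size * n_choosers, replace=True), number=5) / 5
print("single vectorized call: %.4f s" % one_call)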
But if that's the only problem I'd expect it to have finished in a few minutes when @fscottfoti tried setting estimation_sample_size to 1500, and in around 30 minutes with 15000.
It's possible I didn't wait long enough - I probably stopped at about 3 minutes. We should probably provide feedback to the user, like printing the log likelihood at each iteration or something. And if the sampling takes a long time, maybe some feedback during that as well.
But this same estimation used to take far less time to converge, right? If it now takes 3 hours to estimate a reasonably sized model, no one will use it to experiment, and that leads to poorly specified models. Hopefully we can track down the origin of the slowdown and get back to fast performance for this.
Well I think LCMs were wrong before synthicity/urbansim@d17a5ce, but maybe it was acceptably wrong? I will brainstorm ways to make that faster but nothing comes immediately to mind.
Interesting. Yeah, printing some feedback would be a helpful stop-gap.
Last night I left the estimation running -- as it's currently specified in bayarea_urbansim master, with 15,000 choosers and 100 alternatives -- and it finished after about 5 hours. Screenshot attached.
Do we know if this change was what caused estimation times to change a lot?
I agree sampling without replacement is the preferred way to do this, though in practical terms it is not clear that it would change the parameter estimates, given the extremely low probability that an alternative out of a huge universal choice set (say a million alternatives) would be sampled more than once in an estimation set of, say, 100 alternatives. And if it did happen, on an extremely infrequent basis, would it affect the estimated coefficients? I doubt it. Did we ever test whether this was the case?
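As a quick back-of-the-envelope check of that intuition (the sizes below are the illustrative ones from the comment, not from the actual model):

n_alts = 1000000   # "a million alternatives"
k = 100            # alternatives sampled per chooser

# probability that a sample drawn WITH replacement contains at least one
# duplicate alternative (birthday-problem style calculation)
p_no_dup = 1.0
for i in range(k):
    p_no_dup *= float(n_alts - i) / n_alts
print("P(at least one duplicate) = %.4f" % (1.0 - p_no_dup))   # roughly 0.005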
You can get quite detailed logging output from UrbanSim. Put this before you run hlcm_estimate:
import logging
from urbansim.utils.logutil import log_to_stream, set_log_level
set_log_level(logging.DEBUG)
log_to_stream()
What would be really helpful for me is if you generated a profile I can look at for one of your 5-hour runs. You can do that in the Notebook by running hlcm_estimate with the prun magic:
%%prun -q -D hlcm_estimate.prof
orca.run(["hlcm_estimate"])
Then email me that hlcm_estimate.prof file.
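If it's useful, the same profile file can also be inspected locally with the standard library's pstats module. This is just a sketch; it assumes the file was written by the prun command above:

import pstats

stats = pstats.Stats("hlcm_estimate.prof")
# sort by cumulative time and show the top entries, which should point at
# the hot spot (e.g. the alternative sampling inside mnl_interaction_dataset)
stats.sort_stats("cumulative").print_stats(15)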
Hi @jiffyclub, this looks really helpful, thanks. When I run the latter block, it fails with:
KeyError: 'no step named hlcm_estimate'
But if I replace orca.run with sim.run it seems to go fine. Would that do the trick as well, or do I need to get it working through orca? I haven't used that before, but I just installed it using the conda command from here.
@smmaurer sim.run is fine, just means y'all haven't converted to orca yet.
@jiffyclub Great, here's the profile that was generated: https://github.com/ual/bayarea_urbansim/raw/estimation-test/estimation-test/hlcm_estimate.prof
Thanks, I can confirm that all the time is going to drawing alternative samples at https://github.com/UDST/urbansim/blob/acafcd9ce9f67a1d7924d512a32a83facf4904ea/urbansim/urbanchoice/interaction.py#L58.
Yup, changing that line to sample WITH replacement fixes it. A 30-minute model now estimates in 8 seconds. As Paul said, the replacement shouldn't make much difference in practical terms, so I'll submit this as a pull request. Thanks so much for figuring it out!
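To make the trade-off concrete, here's a simplified sketch of that sampling step, not the exact interaction.py code (the function name and arguments are illustrative):

import numpy as np

def sample_alts_per_chooser(alternative_ids, sample_size, n_choosers, replace=True):
    # one np.random.choice call per chooser, roughly the pattern described above;
    # with replace=False each call effectively shuffles the whole alternatives
    # array, which is what made estimation so slow, while replace=True is a
    # cheap integer draw at the cost of possible duplicates within a sample
    return np.concatenate([
        np.random.choice(alternative_ids, sample_size, replace=replace)
        for _ in range(n_choosers)])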
Sam, would you run the same check with and without replacement and compare the estimated parameters?
Paul
If we want to go back to allowing duplicates I would revert https://github.com/UDST/urbansim/commit/d17a5ced1aca84e9e0fce0249921255804d51a52 so there is only one call to np.random.choice. And either way, should we add a check to make sure the number of alternatives is much larger than the sample size? That seems like the only situation in which this kind of draw would be wise.
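Something along these lines would do it; this is only a sketch of the guard, and the function name and threshold are made up:

import warnings

def check_alts_vs_sample_size(n_alts, sample_size, min_ratio=10):
    # warn if the universal choice set isn't much larger than the per-chooser
    # sample, since sampling with replacement would then repeat alternatives often
    if n_alts < min_ratio * sample_size:
        warnings.warn(
            "only %d alternatives for a sample size of %d; duplicate "
            "alternatives in each sample are likely" % (n_alts, sample_size))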
Another option I discovered is using the Python stdlib's random.sample, which is some 1000x faster than np.random.choice. The only problem with that is that it introduces another source of randomness, so we/folks would have to seed two random number generators to get repeatable runs (assuming the NumPy and Python random number generators are separate).
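A sketch of what that would look like, with illustrative sizes and assuming the NumPy and stdlib generators really are independent:

import random
import numpy as np

# repeatable runs would now need both generators seeded
random.seed(0)
np.random.seed(0)

# stdlib sampling without replacement; much faster than
# np.random.choice(..., replace=False) on a large population
alts = range(1800000)              # illustrative universal choice set
sampled = random.sample(alts, 100)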
Seems like it would be worth it if that works.
I agree. If we have a fast way to do sampling without replacement, I would prefer that. But sampling with replacement looks like the lesser of two evils if it takes so long using np.random. Let's try random.sample and see how it works.
Also, it seems like this is something the other UrbanSim would have encountered; if we need to, we can dig into how they got random numbers.
Here's a comparison of estimation results with and without replacement. (Easiest way to compare is by opening the link in two different windows.)
https://github.com/ual/bayarea_urbansim/blob/estimation-test/estimation-test/Estimation-test.ipynb
The first run is WITHOUT replacement and the next two runs are both WITH replacement, to illustrate typical variation between runs of an identical model.
The main difference between WITH and WITHOUT is in the autoPeakTotal term -- not sure why. The other discrepancies are in line with the typical variation between identical runs.
From @waddell in slack: "Interesting. It does seem that the differences are generally comparable to the variation between runs. Not entirely sure about the reason autoPeakTotal might be more sensitive to this - but perhaps it has something to do with having less variability to begin with. Let’s go with this for now, and later try Matt’s suggestion of using stdlib's random.sample."
I've gone ahead and merged UDST/urbansim#148 to restore speedy behavior at the cost of allowing repeated sampling of alternatives. Note that I've also recently removed the simulation framework from UrbanSim (use Orca instead) so if you pull from the UrbanSim master you'll get those changes as well.
I think it is ok to leave the np.random.choice in ActivitySim.