bokae / taxi

Simulating taxi-request matchings on a grid.

Speeding up the simulation #10

Closed bokae closed 5 years ago

bokae commented 6 years ago

The simulation is still quite slow for bigger (e.g. more realistic) grid sizes. Collecting ideas for speeding it up.

Done:

```bash
sbatch -c 6 --mem=1000 run.sh
# -c is the number of threads; it has to divide 24 (e.g. 2, 4, 6 or 12, depending on memory consumption)
# --mem is the memory allocation in MB; it is compulsory, otherwise all memory is allocated to one job and no parallel threads can run
```

Possibilities:

bokae commented 6 years ago

Now, request result files are way too big. Moreover, it is no longer sustainable to leave uncompressed result files in the results folder. They should be written compressed, and then read back compressed by visualize.py.
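A minimal sketch of what that could look like, assuming the results are pandas DataFrames written as CSV (the file name and columns here are hypothetical); pandas handles gzip transparently on both ends:

```python
import os
import pandas as pd

# hypothetical aggregate results for one run
results = pd.DataFrame({"timestamp": [0, 1], "avg_waiting_time": [3.2, 4.1]})

os.makedirs("results", exist_ok=True)

# write compressed; pandas also infers gzip from the ".gz" suffix
results.to_csv("results/run_0.csv.gz", index=False, compression="gzip")

# visualize.py can read it back the same way
results = pd.read_csv("results/run_0.csv.gz", compression="gzip")
```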

bokae commented 6 years ago

Result files are smaller now; I only output some aggregates over the requests.

I'm now indexing the grid everywhere in 1D, and the translations from 1D to 2D and vice versa are stored in dictionaries instead of being computed on the fly.
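For illustration, a minimal sketch of such precomputed translation tables, assuming an n-by-m grid (the names `to_1d` and `to_2d` are hypothetical):

```python
n, m = 100, 100  # hypothetical grid dimensions

# build both translation tables once, up front
to_1d = {(x, y): x * m + y for x in range(n) for y in range(m)}
to_2d = {c: (c // m, c % m) for c in range(n * m)}

# afterwards, every coordinate conversion is a single dictionary lookup
c = to_1d[(3, 7)]
x, y = to_2d[c]
```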

The BFS-trees that search outward from a given position for the nearest locations are also stored in a dictionary, since repeatedly looking up single elements of a numpy.array is incredibly slow compared to a dictionary lookup.
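A minimal sketch of how such a cache could be built, assuming a 1D-indexed n*m grid with four-neighbour moves (the function and dictionary names are hypothetical):

```python
from collections import deque

n, m = 20, 20  # hypothetical grid dimensions

def bfs_order(start):
    """Return all 1D cell indices in increasing grid distance from start."""
    seen = {start}
    order = []
    q = deque([start])
    while q:
        c = q.popleft()
        order.append(c)
        x, y = divmod(c, m)
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < n and 0 <= ny < m:
                nc = nx * m + ny
                if nc not in seen:
                    seen.add(nc)
                    q.append(nc)
    return order

# precompute once; a nearest-location search then just scans bfs_trees[c]
bfs_trees = {c: bfs_order(c) for c in range(n * m)}
```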

Replaced the multivariate random generator with two 1D ones; correlated Gaussians do not interest us for now, and the multivariate generator is much slower than simple normal draws.
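For illustration, a sketch of this replacement with numpy, assuming an uncorrelated 2D Gaussian for positions (the mean and sigma values are hypothetical):

```python
import numpy as np

mean_x, mean_y, sigma = 50.0, 50.0, 10.0  # hypothetical parameters

# instead of the slower 2D call ...
# x, y = np.random.multivariate_normal([mean_x, mean_y], sigma**2 * np.eye(2))

# ... draw the two coordinates independently
x = np.random.normal(mean_x, sigma)
y = np.random.normal(mean_y, sigma)
```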

New structure for keeping track of requests: one deque stores every single request in time order, and another one, with a fixed length, stores the sets of requests that were generated together in one timestep. When a new batch of requests is generated, it pushes the oldest set out of the fixed-length deque. Before losing that oldest set, we delete its requests from the left end of the time-ordered deque. This is how requests that have been waiting too long for a match are dropped.
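A minimal sketch of this two-deque bookkeeping, assuming requests are plain ids (the names and the timeout length are hypothetical):

```python
from collections import deque

max_age = 5                      # hypothetical: keep requests this many timesteps
requests = deque()               # all pending requests, in time order
batches = deque(maxlen=max_age)  # one set of request ids per timestep

def add_batch(new_ids):
    # if the fixed-length deque is full, the oldest batch is about to be
    # pushed out, so first drop its requests from the left end of `requests`
    if len(batches) == batches.maxlen:
        expired = batches[0]
        while requests and requests[0] in expired:
            requests.popleft()
    batches.append(set(new_ids))
    requests.extend(new_ids)
```

Since `requests` is kept in time order, all not-yet-matched members of the oldest batch sit at its left end, so the cleanup never has to scan the whole deque.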