dkirkby / CodeNames

AI for CodeNames

Profile AI execution speed #17

Open dkirkby opened 7 years ago

dkirkby commented 7 years ago

The AI is getting a bit slow, especially with the larger corpus. This issue is to run a profile and see if there are any obvious bottlenecks that would be easy to improve.

dkirkby commented 7 years ago

Use the following command for profiling:

python -m cProfile -o profile.out ./evaluate.py -i word2vec.dat.4 --top-pairs 20

This process runs at 100% CPU (so it is single-threaded) and uses ~390 MB of memory.
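
If it helps to record that number as part of a profiling run, here is a minimal sketch (not in the repo, just an illustration of something that could go at the end of evaluate.py) using the standard resource module:

import resource

# Peak resident set size of this process; note that ru_maxrss is
# reported in bytes on macOS but in kilobytes on Linux.
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('peak RSS: %.0f MB' % (peak / 1e6))  # use / 1e3 on Linux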

dkirkby commented 7 years ago

Here are the top 10 functions by internal time:

In [1]: import pstats
In [2]: p = pstats.Stats('profile.out')
In [3]: p.sort_stats('time').print_stats(10)
Mon Jan 16 08:16:38 2017    profile.out

         197501073 function calls (197470716 primitive calls) in 1255.716 seconds

   Ordered by: internal time
   List reduced from 5678 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 22759354  642.944    0.000  642.944    0.000 {numpy.core.multiarray.dot}
    80201  374.755    0.005  374.755    0.005 {method 'argsort' of 'numpy.ndarray' objects}
    80200   55.315    0.001 1250.953    0.016 ./model.py:50(get_clue)
  8301684   28.906    0.000   43.819    0.000 /Users/david/anaconda/lib/python2.7/site-packages/nltk/corpus/reader/wordnet.py:1690(apply_rules)
  8180000   27.391    0.000  136.882    0.000 ./model.py:31(get_stem)
  7704758   23.065    0.000   23.065    0.000 {method 'reduce' of 'numpy.ufunc' objects}
  8168837   20.798    0.000   85.208    0.000 /Users/david/anaconda/lib/python2.7/site-packages/nltk/corpus/reader/wordnet.py:1679(_morphy)
  8334300   18.429    0.000   20.591    0.000 /Users/david/anaconda/lib/python2.7/site-packages/nltk/corpus/reader/wordnet.py:1696(filter_forms)
 73440688   14.319    0.000   14.319    0.000 {method 'endswith' of 'unicode' objects}
  7559718   11.497    0.000   37.655    0.000 /Users/david/anaconda/lib/python2.7/site-packages/numpy/core/fromnumeric.py:2300(amin)

This indicates that most of the time is spent in these two lines of get_clue:

# Calculate the cosine similarities between the mean vector and all
# the words in our vocabulary.
cosines = np.dot(self.model.syn0norm[:, np.newaxis],
                 mean_vector).reshape(-1)

# Sort the vocabulary by decreasing cosine similarity with the mean.
closest = np.argsort(cosines)[::-1]

and that stemming is not the main bottleneck (get_stem accounts for roughly 11% of the cumulative time).
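
One cheap thing to try before any tree structure (just a sketch, untested here): if get_clue only ever walks the first few hundred entries of closest, np.argpartition (numpy >= 1.8) can pull out the top candidates in roughly linear time, so only that small subset needs a full sort. The num_candidates value below is an arbitrary placeholder:

import numpy as np

def top_candidates(cosines, num_candidates=500):
    """Indices of the num_candidates largest cosines, in decreasing order."""
    # argpartition selects the largest entries without sorting the whole
    # vocabulary; only the selected subset is then argsorted.
    top = np.argpartition(cosines, -num_candidates)[-num_candidates:]
    return top[np.argsort(cosines[top])[::-1]]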

dkirkby commented 7 years ago

A spatial tree built over the normalized embedding vectors for all words in the vocabulary should help speed this up a lot. The basic idea is to quickly find a subset of words that are "close" to mean_vector and then restrict the np.dot and np.argsort operations to this subset.

TODO: try this idea out using a scipy KD-tree.
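
A rough sketch of what that could look like with scipy.spatial.cKDTree (syn0norm and mean_vector are the names from the snippet quoted above; k=100 is arbitrary). Because the rows of syn0norm are unit vectors, Euclidean distance is monotonic in cosine similarity (||a - b||^2 = 2 - 2 a·b), so the k nearest Euclidean neighbors of the normalized mean vector are exactly the k highest-cosine words:

import numpy as np
from scipy.spatial import cKDTree

def build_tree(syn0norm):
    # Build once, when the model is loaded; the rows are already unit length.
    return cKDTree(syn0norm)

def closest_words(tree, mean_vector, k=100):
    # Normalize the query so that Euclidean distance is monotonic in cosine
    # similarity, then take the k nearest neighbors. The returned indices
    # are ordered by increasing distance, i.e. decreasing cosine similarity.
    query = mean_vector / np.linalg.norm(mean_vector)
    distances, closest = tree.query(query, k=k)
    return closest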

dkirkby commented 7 years ago

I just noticed this warning in the scipy KD-tree docs, but I think the idea is still worth trying:

For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. High-dimensional nearest-neighbor queries are a substantial open problem in computer science.
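
One way to check whether that warning bites at word2vec-like dimensionality (typically 100-300, well above the 20 mentioned) is a standalone micro-benchmark. The sizes below are made-up placeholders, since the actual vocabulary size and embedding dimension are not stated in this thread:

import time
import numpy as np
from scipy.spatial import cKDTree

# Placeholder scale: 80k vocabulary words, 300-dimensional unit vectors.
rng = np.random.RandomState(0)
vectors = rng.randn(80000, 300)
vectors /= np.sqrt((vectors ** 2).sum(axis=1))[:, np.newaxis]
query = vectors[0]

tree = cKDTree(vectors)

start = time.time()
tree.query(query, k=100)
print('kd-tree query: %.3fs' % (time.time() - start))

start = time.time()
np.argsort(vectors.dot(query))[::-1][:100]
print('brute force:   %.3fs' % (time.time() - start))

Random vectors are a pessimistic case for the tree (real embeddings are more clustered), so any win here should carry over to the real vocabulary.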