Closed: JamesCarr1 closed this issue 1 year ago.

Original issue: Move evaluation becomes unreasonably long above a depth of 1. This is partly due to a lack of tree pruning, but could also be affected by the speed of commonly used functions. Find what is causing the most slowdown and try to optimise it. Possible culprits:
An initial profile gives the current runtimes:
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.023 0.000 3084.970 3084.970 {built-in method builtins.exec}
1 0.006 0.006 3084.970 3084.970 data_generator.py:1(<module>)
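For reference, a minimal sketch of how profiles like these can be produced with the standard-library profiler; main is a hypothetical entry point standing in for the module-level code in data_generator.py:

import cProfile

# Equivalent to running `python -m cProfile -s cumtime data_generator.py`
# from the shell.
cProfile.run('main()', sort='cumtime')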
Execution appears to be dominated by list comprehension calls. Assuming these amount to .append calls when building the move tree, the first step is to make node addition more efficient.
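As a rough illustration (the actual MoveTree class isn't shown in this issue, so its structure here is an assumption), the idea is to add each node with a single O(1) append rather than rebuilding the children list with a comprehension:

class MoveTree:
    def __init__(self, move=None, parent=None):
        self.move = move
        self.parent = parent
        self.children = []

    def add_child(self, move):
        # One append per new node instead of reconstructing self.children
        child = MoveTree(move, parent=self)
        self.children.append(child)
        return child

root = MoveTree()
child = root.add_child('e2e4')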
With the improved MoveTree code:
ncalls      tottime  percall  cumtime   percall   filename:lineno(function)
1660/1      0.023    0.000    3080.024  3080.024  {built-in method builtins.exec}
1           0.005    0.005    3080.024  3080.024  data_generator.py:1(<module>)
1           0.000    0.000    3079.067  3079.067  data_generator.py:160(choose_move)
9323/1      0.535    0.000    3073.221  3073.221  data_generator.py:100(get_best_evals)
9323        0.300    0.000    3029.279  0.325     data_generator.py:140(<listcomp>)
377076      1600.541 0.004    2880.532  0.008     {built-in method builtins.max}
26051947997 1346.191 0.000    1346.191  0.000     data_generator.py:140(<lambda>)
357644      82.252   0.000    148.454   0.000     {built-in method builtins.min}
The main time sink appears to be line 140. First, let's look at map() versus list comprehensions: map() only tends to win when it is passed a predefined (named or built-in) function, while wrapping the work in a lambda adds a Python-level function call per element, so in that case a list comprehension (or a named key function) is usually the better choice.
Significantly improved execution time by replacing the lambda with a single named function, min_max_eval, at ~line 140 of data_generator.py.
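A minimal sketch of the kind of change, with placeholder data since the actual code at line 140 isn't shown in this issue:

from types import SimpleNamespace

# `subtrees` and `evals` are stand-ins for the real structures in
# data_generator.py.
subtrees = [SimpleNamespace(evals=[0.2, -0.5]), SimpleNamespace(evals=[0.1, 0.4])]

# Before (illustrative): the key is an inline lambda; the profiler
# attributed ~2.6e10 lambda calls on line 140 to this pattern.
best = max(subtrees, key=lambda t: min(t.evals))

# After: the same key hoisted into one named function. A def and a lambda
# cost the same per call in CPython, so the bulk of the observed speedup
# likely came from restructuring the evaluation around min_max_eval rather
# than from the rename itself.
def min_max_eval(subtree):
    return min(subtree.evals)

best = max(subtrees, key=min_max_eval)

Profile after the change: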
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.020 0.000 49.574 49.574 {built-in method builtins.exec}
1 0.005 0.005 49.574 49.574 data_generator.py:1(<module>)
Computation currently appears to be dominated by the conversion from FEN to a matrix. Rewriting the current code as a chain of if statements does not improve efficiency (it actually decreases it). Could consider writing this step in a more efficient language?
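Before reaching for another language, one cheap alternative is a precomputed dict lookup per FEN character. A minimal sketch, assuming an 8x8 signed-integer board encoding (the real encoding in data_generator.py isn't shown):

# The piece-to-value mapping here is an assumption.
PIECE_VALUES = {
    'P': 1, 'N': 2, 'B': 3, 'R': 4, 'Q': 5, 'K': 6,
    'p': -1, 'n': -2, 'b': -3, 'r': -4, 'q': -5, 'k': -6,
}

def fen_to_matrix(fen):
    board = []
    for rank in fen.split()[0].split('/'):  # board field only, rank by rank
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend([0] * int(ch))     # a digit encodes empty squares
            else:
                row.append(PIECE_VALUES[ch])  # one dict lookup per piece
        board.append(row)
    return board

matrix = fen_to_matrix('rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1')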
The as_tensor conversion has been significantly optimised, reducing its calculation time from 26 s down to 9 s (see the sketch after this profile):
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.026 0.000 32.195 32.195 {built-in method builtins.exec}
1 0.005 0.005 32.195 32.195 data_generator.py:1(<module>)
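A sketch of the kind of change that tends to help here; the actual optimisation in data_generator.py isn't shown, so this is illustrative. torch.as_tensor is much cheaper when handed one contiguous numpy array than a nested Python list, because it can wrap the buffer instead of copying element by element:

import numpy as np
import torch

def board_to_tensor(matrix):
    arr = np.asarray(matrix, dtype=np.float32)  # one contiguous allocation
    return torch.as_tensor(arr)                 # wraps the array, no per-element copy

t = board_to_tensor([[0] * 8 for _ in range(8)])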
It would now be ideal to improve the forward() call in the model.
forward() currently (and unnecessarily) converts a map to a list to a tensor and uses torch.sum.
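Roughly, the two patterns look like this; piece_value and features are placeholders, since the real forward() isn't shown in the issue:

import torch

def piece_value(f):
    return float(f)  # stand-in for the model's real per-feature value

features = [1, -3, 5]

# Current pattern: map -> list -> tensor, then torch.sum on a tiny tensor.
total = torch.sum(torch.tensor(list(map(piece_value, features))))

# Proposed pattern: keep the list but skip the tensor and torch.sum,
# using the built-in sum() instead.
total = sum(list(map(piece_value, features)))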
Try just building a plain list and summing it with the built-in sum():
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.021 0.000 29.584 29.584 {built-in method builtins.exec}
1 0.005 0.005 29.584 29.584 data_generator.py:1(<module>)
This has improved the performance by about 3 seconds.
Now going to compare map() against building the list with a for loop:
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.020 0.000 29.758 29.758 {built-in method builtins.exec}
1 0.005 0.005 29.758 29.758 data_generator.py:1(<module>)
This is slightly slower with the for loop. Try a list comprehension.
With a list comprehension:
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.023 0.000 28.593 28.593 {built-in method builtins.exec}
1 0.005 0.005 28.593 28.593 data_generator.py:1(<module>)
Which is appreciably faster!
Finally, try just summing in a for loop:
ncalls tottime percall cumtime percall filename:lineno(function)
1660/1 0.024 0.000 28.981 28.981 {built-in method builtins.exec}
1 0.005 0.005 28.981 28.981 data_generator.py:1(<module>)
Slower, so use the list comprehension. Actually, that result seems to be a fluke: summing in a for loop is faster.
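Given how close these timings are (within ~1 s of a ~29 s run), the variants are easy to compare in isolation; a micro-benchmark sketch with a stand-in workload, since the real per-item function isn't shown:

import timeit

vals = list(range(1_000))

def listcomp_sum():
    return sum([v * 2 for v in vals])  # build a list, then sum it

def loop_sum():
    total = 0
    for v in vals:                     # accumulate as we go, no list built
        total += v * 2
    return total

for fn in (listcomp_sum, loop_sum):
    print(fn.__name__, timeit.timeit(fn, number=10_000))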
Currently dominated by forward() and as_tensor(), so focus on those.
Actually having a quick look at find_possible_moves. This is dominated by the lines:
if self.current_position.outcome() is None: # If game is over, don't try to find any more moves
self.find_possible_moves(depth - 0.5, new_tree) # find valid moves. One move is 'half a depth' so subtract 0.5
Of this, naively removing the if statement approximately halves the computation time (2.753 s vs. 5.845 s).
Turns out the if statement is totally unnecessary: self.current_position.legal_moves will simply yield no moves if the game is over, so nothing will be added.
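A sketch of the simplified method, assuming python-chess (so self.current_position is a chess.Board) and a hypothetical add_child helper, since the full method isn't shown:

import chess

def find_possible_moves(self, depth, tree):
    if depth <= 0:
        return
    # No outcome() check needed: when the game is over, legal_moves yields
    # nothing, so the loop body never runs and the recursion stops itself.
    for move in self.current_position.legal_moves:
        self.current_position.push(move)
        new_tree = tree.add_child(move)  # hypothetical helper
        self.find_possible_moves(depth - 0.5, new_tree)  # one move = half a depth
        self.current_position.pop()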
Closing this issue for now - am going to look at alternative optimisation methods.