Closed elizlee closed 2 years ago
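Comparing the call that takes the longest in the profile (times in seconds):

CPU, padding="max_length", max_length=256:
ncalls tottime percall cumtime percall filename:lineno(function)
43853 2052.314 0.047 2052.314 0.047 {method 'matmul' of 'torch._C._TensorBase' objects}

CPU, padding=True:
ncalls tottime percall cumtime percall filename:lineno(function)
43853 524.376 0.012 524.376 0.012 {method 'matmul' of 'torch._C._TensorBase' objects}

Over the same 43853 calls, total matmul time drops from 2052.314 to 524.376, roughly a 3.9x reduction.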
Further reduces wikidata linking time by making the max input length in a batch the same as the longest input in that batch (instead of `256`). Also makes `max_batch_size` configurable.
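To illustrate the idea, here is a minimal, library-free sketch of padding each batch to its own longest input rather than to a fixed length. It is not the code from this PR: the helper names `iter_batches` and `pad_batch`, the `pad_id` value, and the sample token ids are all hypothetical. With Hugging Face tokenizers, the equivalent switch is `padding=True` (pad to the longest sequence in the batch) instead of `padding="max_length"`.

```python
from typing import Iterator, List, Optional, Sequence


def iter_batches(items: Sequence[List[int]], max_batch_size: int) -> Iterator[Sequence[List[int]]]:
    """Yield consecutive slices of at most max_batch_size items (hypothetical helper)."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


def pad_batch(
    batch: Sequence[List[int]],
    pad_id: int = 0,
    fixed_length: Optional[int] = None,
) -> List[List[int]]:
    """Pad every sequence to one width.

    With fixed_length=None the width is the longest sequence in this batch
    (dynamic padding). Otherwise every sequence is truncated or padded to
    fixed_length, which mimics padding to a constant max length.
    """
    target = fixed_length if fixed_length is not None else max(len(seq) for seq in batch)
    return [list(seq[:target]) + [pad_id] * (target - len(seq)) for seq in batch]


if __name__ == "__main__":
    # Made-up token ids, purely for illustration.
    token_ids = [[5, 6, 7], [8, 9], [1, 2, 3, 4, 5, 6], [7]]
    for batch in iter_batches(token_ids, max_batch_size=2):
        dynamic = pad_batch(batch)
        fixed = pad_batch(batch, fixed_length=256)
        # Width of the padded batch under each strategy.
        print(len(dynamic[0]), len(fixed[0]))
```

Because the matmul cost grows with the padded sequence length, short inputs padded to 256 spend most of their time multiplying padding; padding per batch avoids that work, which is consistent with the profile above.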