isi-vista / cdse-covid

Claim detection & semantic extraction (Covid-19 domain)

Faster linking #202

Closed elizlee closed 2 years ago

elizlee commented 2 years ago

Further reduces Wikidata linking time by padding each batch to the length of its longest input instead of to a fixed length of 256. Also makes max_batch_size configurable.
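To illustrate why this helps, here is a minimal pure-Python sketch of padding to the longest input in a batch versus padding to a fixed length. It is not the code from this PR: `pad_batch`, `make_batches`, and `PAD_ID` are hypothetical names, and the token ids are made up for the example.

```python
from typing import Iterator, List, Optional

PAD_ID = 0  # hypothetical padding token id, chosen only for this sketch


def pad_batch(batch: List[List[int]], max_length: Optional[int] = None) -> List[List[int]]:
    """Pad (or truncate) every sequence in `batch` to one common length.

    With max_length=None the target is the longest sequence in the batch;
    otherwise every sequence is padded or truncated to max_length.
    """
    target = max_length if max_length is not None else max(len(seq) for seq in batch)
    return [seq[:target] + [PAD_ID] * (target - len(seq)) for seq in batch]


def make_batches(items: List[List[int]], max_batch_size: int) -> Iterator[List[List[int]]]:
    """Yield consecutive slices of `items` holding at most max_batch_size entries."""
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


# Four short made-up inputs, processed two at a time.
inputs = [[5, 6, 7], [8, 9], [1, 2, 3, 4, 5], [7]]

fixed = [pad_batch(b, max_length=256) for b in make_batches(inputs, max_batch_size=2)]
dynamic = [pad_batch(b) for b in make_batches(inputs, max_batch_size=2)]

# Number of token positions the model would have to process in each case.
fixed_tokens = sum(len(seq) for batch in fixed for seq in batch)
dynamic_tokens = sum(len(seq) for batch in dynamic for seq in batch)
print(fixed_tokens, dynamic_tokens)
```

With short inputs, most of the positions in the fixed-length case are padding, which is where the extra matrix-multiplication time would come from.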

elizlee commented 2 years ago

Comparing the cProfile entries for the call that takes the longest (matmul), before and after the change:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
CPU, padding=max_length=256
43853 2052.314    0.047 2052.314    0.047 {method 'matmul' of 'torch._C._TensorBase' objects}
CPU, padding=True
43853  524.376    0.012  524.376    0.012 {method 'matmul' of 'torch._C._TensorBase' objects}
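As a quick arithmetic check on the two rows above, the following sketch computes the speedup and the time saved from the reported `tottime` values. The variable names are only for illustration; the numbers are copied from the profile output.

```python
# Values copied from the two profile rows above.
ncalls = 43853
fixed_total = 2052.314    # tottime in seconds with padding=max_length=256
dynamic_total = 524.376   # tottime in seconds with padding=True

speedup = fixed_total / dynamic_total
saved_seconds = fixed_total - dynamic_total

print(f"matmul speedup: {speedup:.2f}x")
print(f"time saved: {saved_seconds:.1f} s over {ncalls} calls")
print(f"per call: {fixed_total / ncalls:.3f} s vs {dynamic_total / ncalls:.3f} s")
```

The total matmul time drops by a factor of roughly 3.9, with the number of calls unchanged.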