The current matching process is slow, largely because it relies on Python `for` loops, which carry interpreter overhead on every iteration. We need to speed it up before we can apply it to the bigger dataset.
Tools
Use a parallel-processing package (e.g. multiprocess) to spread the matching across CPU cores.
Partly rewrite the functions to reduce the use of for loops.
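
As a rough illustration of the second option, here is a minimal sketch of replacing a nested loop with a one-off lookup table. It is not the project's actual matching code: the function names, the `(id, key)` record layout and the exact-match rule are all assumptions made for the example.

```python
# Hypothetical records: (id, key), where key is a tuple of matching attributes.
# Neither function is taken from the repository; both are illustrative only.

def match_nested(people, candidates):
    """Baseline: for each person, scan every candidate until one matches."""
    matches = {}
    for pid, key in people:
        for cid, ckey in candidates:
            if key == ckey:
                matches[pid] = cid
                break  # keep the first matching candidate
    return matches


def match_indexed(people, candidates):
    """Build a key -> candidate lookup once, then match with dict accesses."""
    index = {}
    for cid, ckey in candidates:
        # setdefault keeps the first candidate per key, like the baseline.
        index.setdefault(ckey, cid)
    return {pid: index[key] for pid, key in people if key in index}


people = [(1, ("F", 30)), (2, ("M", 40)), (3, ("F", 99))]
candidates = [(10, ("M", 40)), (11, ("F", 30)), (12, ("F", 30))]

# Both approaches should agree; the indexed one avoids the inner loop.
assert match_nested(people, candidates) == match_indexed(people, candidates)
```

The indexed version does one pass over the candidates and one over the people, instead of comparing every pair. Whether this carries over depends on the real matching rule: it only works as written when a match means exact equality on the key.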
As discussed, we'll revisit optimisation after updating to an alternative matching approach in #13 and continue to work with a sample of 15000 of the SPC population for now.