chenzhiw closed this 2 years ago
There is non-trivial overhead associated with accessing / unpacking locals() and with the ThermoAnalysis object manipulation in the current primer3-py implementation.
In [5]: %prun [primer3.calcTm('GTAAAACGACGGCCAGT') for _ in range(100000)]
         400004 function calls in 0.368 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100000    0.126    0.000    0.126    0.000 bindings.py:57(_setThermoArgs)
   100000    0.109    0.000    0.324    0.000 bindings.py:208(calcTm)
   100000    0.050    0.000    0.050    0.000 {method 'calcTm' of 'primer3.thermoanalysis.ThermoAnalysis' objects}
        1    0.043    0.043    0.367    0.367 <string>:1(<listcomp>)
   100000    0.040    0.000    0.040    0.000 {built-in method builtins.locals}
        1    0.001    0.001    0.368    0.368 <string>:1(<module>)
        1    0.000    0.000    0.368    0.368 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
_setThermoArgs accounts for 0.126 / 0.324 ≈ 39% of the calcTm call time. So if you factor that out of your timing numbers, the per-call time of the binding is about 3.09 µs × (1 − 0.39) ≈ 1.9 µs, which is close to the 1.93 µs you measured for the direct ctypes call. (This is a little hand-wavy, but hopefully it helps explain the discrepancy.)
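The back-of-the-envelope estimate above can be reproduced from the figures in the profile and the %timeit run. This is only a sketch of the arithmetic: the variable names are made up here, and it uses the unrounded ratio, so the last digit may differ slightly from the hand calculation.

```python
# Figures copied from the %prun and %timeit output in this thread.
set_args_tottime = 0.126  # seconds, bindings.py:57(_setThermoArgs), tottime
calc_tm_cumtime = 0.324   # seconds, bindings.py:208(calcTm), cumtime
per_call_us = 3.09        # microseconds, %timeit primer3.calcTm(...)

# Fraction of calcTm's cumulative time spent inside _setThermoArgs.
share = set_args_tottime / calc_tm_cumtime

# Per-call time with that fraction factored out.
adjusted_us = per_call_us * (1 - share)

print(f"_setThermoArgs share of calcTm: {share:.1%}")
print(f"adjusted per-call time: {adjusted_us:.2f} us")
```

Note that this mixes a profiled run (which adds its own instrumentation overhead) with an unprofiled %timeit figure, so it is an estimate rather than a measurement.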
Yeah, thank you. It helps me to understand it.
In [1]: from ctypes import CDLL, c_char_p, c_double, c_int
In [2]: import primer3
In [3]: oligotm_so = CDLL("./oligotm.so")
In [4]: oligotm = oligotm_so.oligotm
In [5]: oligotm.restype = c_double
In [6]: oligotm.argtypes = [c_char_p, c_double, c_double, c_double, c_double, c_int, c_int]
In [7]: %timeit primer3.calcTm('GTAAAACGACGGCCAGT')
3.09 µs ± 20.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [8]: %timeit primer3.wrappers.calcTm('GTAAAACGACGGCCAGT')
3.5 ms ± 96.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [9]: %timeit oligotm('GTAAAACGACGGCCAGT'.encode(), 50, 50, 0, 0.8, 1, 1)
2 µs ± 50.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [10]: %timeit oligotm(b'GTAAAACGACGGCCAGT', 50, 50, 0, 0.8, 1, 1)
1.93 µs ± 15.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [11]: primer3.calcTm('GTAAAACGACGGCCAGT')
Out[11]: 49.16808228911765
In [12]: primer3.wrappers.calcTm('GTAAAACGACGGCCAGT')
Out[12]: 49.168082
In [13]: oligotm(b'GTAAAACGACGGCCAGT', 50, 50, 0, 0.8, 1, 1)
Out[13]: 49.16808228911765
Why is that?