draw.tag, draw/utils/general.py:project_points_onto_original_image
Use np.rint instead of np.round, since we are rounding to the nearest integer anyway. (~5.54s to 4.16s)
Use points.reshape directly, since all points in PKD are already float. Use np.ndarray * tuple for the column-wise size calculations, which avoids in-place modification and the need to create a temporary array. (~5.48s to 3.68s)
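A minimal sketch of the two changes above, assuming the points are relative (x, y) coordinates scaled by a (width, height) tuple. The function name and arguments are illustrative, not the actual PKD signature.

```python
import numpy as np


def project_points(points, image_size):
    """Scale relative (x, y) points to integer pixel coordinates.

    `points` is assumed to be a float array of coordinates in [0, 1] and
    `image_size` a (width, height) tuple. Both names are illustrative.
    """
    # Multiplying by the tuple broadcasts column-wise and returns a new
    # array, so the input is never modified in place.
    scaled = points.reshape(-1, 2) * image_size
    # np.rint rounds to the nearest integer but keeps a float dtype,
    # so a cast is still needed to obtain integer pixel indices.
    return np.rint(scaled).astype(np.int64)


points = np.array([0.25, 0.5, 1.0, 0.1])
pixels = project_points(points, (640, 480))
```

The input array is left untouched, which is why no defensive copy is needed before scaling.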
dabble.tracking, dabble/trackingv1/tracking_files/iou_tracker.py
Use a mask to filter out detections so the original numpy array can be used instead of a list with del. This enables computing IoU with vectorization instead of row-wise calculations (~23.5s to 18.8s)
Use np.maximum and np.minimum directly without np.c_ (~18.8s to 7.28s)
Multiply directly instead of using np.prod to avoid an unnecessary reduce operation (~7.28s to 6.02s)
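The three tracking changes can be sketched together as follows. This is an illustration under assumptions, not the code in iou_tracker.py: boxes are taken to be in (x1, y1, x2, y2) format, and the function name and the 0.5 threshold are hypothetical.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU of one box against N boxes, all in (x1, y1, x2, y2) format."""
    # Element-wise max/min on column slices; no np.c_ concatenation needed.
    top_left = np.maximum(box[:2], boxes[:, :2])
    bottom_right = np.minimum(box[2:], boxes[:, 2:])
    # Clamp at zero so non-overlapping boxes get zero intersection.
    wh = np.clip(bottom_right - top_left, 0, None)
    # Multiply the two columns directly instead of np.prod(wh, axis=1).
    intersection = wh[:, 0] * wh[:, 1]
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return intersection / (area_box + area_boxes - intersection)


detections = np.array([
    [0.0, 0.0, 2.0, 2.0],
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 6.0, 6.0],
])
track_box = np.array([0.0, 0.0, 2.0, 2.0])

ious = iou_one_to_many(track_box, detections)
# A boolean mask keeps detections as an ndarray instead of a list edited with del.
unmatched = detections[ious < 0.5]
```

Because the detections stay in a single array, the IoU against every candidate is computed in one call rather than in a Python loop.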
draw.legend
Perform the addWeighted operation on only a portion of the input frame to avoid copying the entire frame (~6.57s to 4.28s)
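The idea of blending only a region can be sketched as below. To stay dependency-free, plain NumPy arithmetic stands in for cv2.addWeighted here (both compute a weighted sum of two images). The function name, arguments, and region coordinates are hypothetical.

```python
import numpy as np


def blend_region(frame, overlay_color, x1, y1, x2, y2, alpha=0.3):
    """Blend a solid color into frame[y1:y2, x1:x2] in place."""
    # Slicing gives a view, so only the region is read and written;
    # the rest of the frame is never copied.
    region = frame[y1:y2, x1:x2]
    color = np.asarray(overlay_color, dtype=np.float32)
    blended = alpha * color + (1 - alpha) * region
    # Round and clip back to valid 8-bit pixel values.
    frame[y1:y2, x1:x2] = np.clip(np.rint(blended), 0, 255).astype(np.uint8)
    return frame


frame = np.full((4, 6, 3), 100, dtype=np.uint8)
blend_region(frame, (0, 0, 0), x1=0, y1=0, x2=2, y2=2, alpha=0.5)
```

Pixels outside the region keep their original values, so the cost scales with the legend area rather than the frame size.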
model.posenet (unable to visualize with snakeviz after applying decorator)
Use np.rint in _clip_to_indices
Use np.linalg.norm instead of a manual sum of squares
Apply the numba.jit decorator
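For the np.linalg.norm change, a small comparison, assuming the replaced code computed a Euclidean distance between keypoints. The arrays are made-up sample data.

```python
import numpy as np

# Hypothetical keypoint coordinates, one (x, y) row per keypoint.
a = np.array([[0.0, 0.0], [1.0, 1.0]])
b = np.array([[3.0, 4.0], [1.0, 2.0]])

# Manual version: square, sum over the coordinate axis, then take the root.
manual = np.sqrt(np.sum((a - b) ** 2, axis=1))
# Same Euclidean distance in a single call.
distances = np.linalg.norm(a - b, axis=1)
```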
model.efficientdet
Perform preprocessing calculations in one line (~43.5s to 26.7s)
Reimplement np.pad with np.zeros so that numba.jit can be used (~26.7s to 10.8s)
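A rough sketch of the np.pad replacement. It assumes the image is a float32 (H, W, C) array padded with zeros on the bottom and right; the actual padding layout in efficientdet may differ. The function name is hypothetical, and njit (shorthand for numba.jit in nopython mode) falls back to a no-op decorator when numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python if numba is not installed
    def njit(func):
        return func


@njit
def pad_bottom_right(image, target_h, target_w):
    """Zero-pad an (H, W, C) float32 image to (target_h, target_w, C)."""
    h, w, c = image.shape
    # Allocate the padded output once, then copy the image into its corner.
    padded = np.zeros((target_h, target_w, c), dtype=np.float32)
    padded[:h, :w, :] = image
    return padded


image = np.ones((2, 3, 3), dtype=np.float32)
padded = pad_bottom_right(image, 4, 4)
# Reference result from np.pad (constant zero padding) for comparison.
reference = np.pad(image, ((0, 2), (0, 1), (0, 0)))
```

The allocate-then-assign form relies only on np.zeros and slice assignment, which keeps the function simple enough to compile.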
Dependencies
Add numba >= 0.56.4 dependency
Bump numpy to 1.18.5 for linting requirements
Remove the upper version limit for tensorflow (the offending code in efficientdet has already been removed; tested to work up to tensorflow==2.11.0)
Relax torch and torchvision version pinning (tested to work up to torch==1.13.0 and torchvision==0.14.0)
Refactors for speedup
Contains various algorithm refactors and changes to improve execution speed. Numbers provided are from cProfile + snakeviz where possible.
dabble.zone_count
Move .buffer(1) from contains() to __init__() (~12.8s to 4.47s)
Create Point from btm_midpoint once (~8.86s to 5.48s)