Joblib Parallel is actually slowing down tracker update

Kumaken commented 3 years ago

First of all, I want to say that this is a wonderful project and I had a blast experimenting with it. Thank you very much.

However, I noticed that I got a higher FPS (with the YOLO detector) when I replaced Joblib's Parallel call with an ordinary for-loop in ObjectCounter.py (the line that calls update_blob_tracker for every blob in blobs_list).

Is this a problem with my hardware (Windows, 8-core CPU, in case it is relevant)? Or does the overhead of parallelism outweigh the cost of updating the blobs' trackers?
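For anyone who wants to check this on their own machine, here is a rough timing sketch. It is not the project's code: the standard library's ThreadPoolExecutor stands in for Joblib's threading backend, and update_tracker, the blob sizes, and the worker count are hypothetical placeholders. A pool is created on every call, on the assumption that the per-frame dispatch pays a similar setup cost.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def update_tracker(blob):
    # Hypothetical stand-in for a cheap, CPU-bound tracker update.
    return sum(i * i for i in range(blob))


def run_serial(blobs):
    # Plain for-loop equivalent: one update after another.
    return [update_tracker(b) for b in blobs]


def run_threaded(blobs, workers=8):
    # A new thread pool per call (assumed to resemble per-frame dispatch).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update_tracker, blobs))


def time_per_frame(fn, blobs, frames=200):
    # Average wall-clock seconds spent per simulated frame.
    start = time.perf_counter()
    for _ in range(frames):
        fn(blobs)
    return (time.perf_counter() - start) / frames


if __name__ == "__main__":
    blobs = [2000] * 10  # ten blobs per frame, arbitrary workload size
    print("serial  :", time_per_frame(run_serial, blobs))
    print("threaded:", time_per_frame(run_threaded, blobs))
```

Both functions return the same results, so only the timings differ. The numbers depend heavily on how expensive each update is and on whether the work releases the GIL, so treat them as indicative only.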

Thank you in advance.

nicholaskajoh commented 3 years ago

Interesting observation. I never tested this methodically. I was looking for ways to improve the system's performance, and running the updates in parallel seemed like a good one. But I ran into an issue: I wanted to use process-based parallelism (i.e., multiple CPU cores), but that didn't work because Joblib could not pickle the tracker object. I tried serializing it myself, but that didn't work either, so I fell back to threads because I didn't want to throw out all the changes I'd made. 😅

So it's not a problem with your hardware. With threads, Joblib is effectively using only one of your eight CPU cores, if I understand how it works correctly. If you'd like to take a stab at fixing or improving the code, I'm happy to support you. 🙂
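To illustrate the pickling problem: process-based backends have to serialize the arguments they send to worker processes, so any object that cannot be pickled blocks that route. The sketch below is only an analogy. FakeTracker is a hypothetical class, and a threading.Lock stands in for whatever native handle the real tracker object holds.

```python
import pickle
import threading


class FakeTracker:
    """Hypothetical stand-in for a tracker that wraps a native handle."""

    def __init__(self):
        # Lock objects cannot be pickled, much like many native handles.
        self._handle = threading.Lock()


def can_pickle(obj):
    # True if the object survives pickle.dumps, False otherwise.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


if __name__ == "__main__":
    print(can_pickle({"bbox": (0, 0, 10, 10)}))  # plain data is fine
    print(can_pickle(FakeTracker()))             # the handle blocks pickling
```

A quick check like can_pickle on a real tracker instance would show whether it can be sent to a worker process at all. If it cannot, one option might be to send only plain data (frames and bounding boxes) and keep the tracker objects inside the workers, though I have not tried that here.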

pentium10 commented 2 years ago

@Kumaken do you have a gist with the change you made?

@nicholaskajoh is there a quick way to improve that 1/8 speed?

nicholaskajoh / ivy, issue #63