Open Kumaken opened 3 years ago
Interesting observation. I didn't test this methodically. I was looking for ways to improve the system's performance, and running the updates in parallel seemed like a good way to do so. But I ran into an issue: I wanted to use process-based parallelism (i.e., multiple CPUs), but this didn't work because Joblib could not pickle the tracker object. I tried some serialization workarounds, but those didn't work either, so I decided to use threads because I didn't want to throw out all the changes I'd made. 😅
So it's not a problem with your hardware. If I understand how it works correctly, Joblib is only using one of your eight CPU cores (1/8). If you'd like to take a stab at fixing/improving the code, I'm happy to support you. 🙂
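For readers following along, here is a minimal sketch of the pickling failure described above. It uses only the standard library, and `FakeTracker` is a hypothetical stand-in, not the project's tracker class: it holds a `threading.Lock` purely because that is a stdlib object that cannot be pickled, on the assumption that the real tracker wraps a similarly unpicklable native handle. Process-based backends have to pickle arguments to send them to worker processes, which is where an object like this fails.

```python
import pickle
import threading


class FakeTracker:
    """Hypothetical stand-in for a tracker wrapping an unpicklable handle."""

    def __init__(self):
        # Assumption: the lock only simulates a native handle that pickle rejects.
        self._handle = threading.Lock()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


if __name__ == "__main__":
    # Plain data can be shipped to a worker process.
    print("plain dict picklable:", can_pickle({"box": (0, 0, 10, 10)}))
    # An object holding an unpicklable handle cannot.
    print("FakeTracker picklable:", can_pickle(FakeTracker()))
```

Threads share memory, so nothing needs to be pickled, which is why falling back to threads avoids the error even though it does not spread CPU-bound Python work across cores.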
@Kumaken do you have a gist with the change you've made?
@nicholaskajoh is there a quick way to improve that 1/8 speed?
First of all, I want to say that this is a wonderful project and I had a blast experimenting with it. Thank you very much.
However, one thing I noticed is that I got a higher FPS (YOLO detector) when I replaced Joblib's Parallel code with an ordinary for-loop in ObjectCounter.py (the line where you call update_blob_tracker for every blob in blobs_list).
Is this a problem with my hardware (Windows, 8-core CPU, in case it is relevant)? Or does the overhead of parallelism outweigh the cost of updating the blobs' trackers?
Thank you in advance.
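One way to check the overhead question on a given machine is a small timing sketch like the one below. It is an illustration under stated assumptions, not the project's code: `update_blob` is a hypothetical, very cheap per-blob update, and the standard library's `ThreadPoolExecutor` is swapped in for Joblib so the snippet has no dependencies. When each task is this small, the cost of dispatching work to a pool can exceed the work itself, so the plain loop may come out ahead. Actual numbers depend on the hardware and on how expensive the real tracker update is.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def update_blob(blob):
    """Hypothetical cheap per-blob update (stand-in for a tracker update)."""
    x, y = blob
    return (x + 1, y + 1)


def run_sequential(blobs):
    """Update every blob with an ordinary loop."""
    return [update_blob(b) for b in blobs]


def run_threaded(blobs, workers=8):
    """Update every blob through a thread pool; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update_blob, blobs))


def average_seconds(fn, blobs, repeats=200):
    """Average wall-clock time of fn(blobs) over several repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(blobs)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    blobs = [(i, i) for i in range(20)]
    # Both strategies must produce the same result before timing them.
    assert run_sequential(blobs) == run_threaded(blobs)
    print("for-loop :", average_seconds(run_sequential, blobs))
    print("threaded :", average_seconds(run_threaded, blobs))
```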