How to run multiple models on a single video

Qengineering / Face-Recognition-with-Mask-Jetson-Nano

Recognize 2000+ faces on your Jetson Nano with additional mask detection, auto-fill and anti-spoofing

https://qengineering.eu/deep-learning-examples-on-raspberry-32-64-os.html

BSD 3-Clause "New" or "Revised" License


How to run multiple models on a single video #15

Open rsingh2083 opened 3 years ago

rsingh2083 commented 3 years ago

Hi Sir,

You have 4 models (RetinaFace, Mask, Arcface & AntiSpoofing) running on a single video.

1.] What exactly is this method of running multiple models called?

2.] Is there any documentation or book on how to run multiple models at once?

3.] I'm asking because I have one model for license plate detection and another for license plate recognition, and I want to combine them in the same way you have done. (The first model tries to locate a license plate in the video -> if one is found -> the second model reads the number plate.)

Qengineering commented 3 years ago

It looks more complicated than it actually is. It's a chain, as you describe yourself in the case of the license plates. The only concern is your input and output. Each video frame should be resized to the input size of the first model. The output of this model gives the plate locations, expressed in the coordinates of that resized input. These must be scaled back to the original frame size for them to fit properly. Tip: first run your code with only the first model to check that the license plate locations are detected properly. The second model is treated in the same way: resize the license plate rectangle to the input size of the OCR model and get your output. It's all standard practice. Do keep in mind the time it takes to process a single frame. Most DNNs are not well suited to the limited power of a Jetson Nano.
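To make the coordinate step concrete, here is a minimal sketch of scaling a detection box from the model's input size back to the original frame. The function name `scale_box` and the sizes used are illustrative assumptions, not code from this repository. It also assumes the frame was resized directly to the model input; if your pre-processing adds letterbox padding, the padding offset has to be subtracted first.

```python
def scale_box(box, model_size, frame_size):
    """Map a box (x1, y1, x2, y2) from model-input pixels to frame pixels.

    Hypothetical helper: assumes a plain resize without letterbox padding.
    """
    model_w, model_h = model_size
    frame_w, frame_h = frame_size
    sx = frame_w / model_w  # horizontal scale factor
    sy = frame_h / model_h  # vertical scale factor
    x1, y1, x2, y2 = box
    # Round to whole pixels and clamp to the frame borders.
    x1 = max(0, min(frame_w, round(x1 * sx)))
    y1 = max(0, min(frame_h, round(y1 * sy)))
    x2 = max(0, min(frame_w, round(x2 * sx)))
    y2 = max(0, min(frame_h, round(y2 * sy)))
    return (x1, y1, x2, y2)


# A plate found in a 320x240 model input, mapped to a 1280x720 frame.
print(scale_box((80, 120, 160, 150), (320, 240), (1280, 720)))
# → (320, 360, 640, 450)
```

The scaled rectangle is what you cut out of the original frame before resizing it again for the second model.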

rsingh2083 commented 3 years ago

Thanks, Sir. Is there a name for this entire procedure, so that I can search for it and learn more about it?

Qengineering commented 3 years ago

Not really. The term 'cascading' comes to mind, but searching Google for it mostly turns up pages about merging several DNNs into one, or something similar. I would just divide your problem into two separate ones, just like you already did, and solve them one by one.

rsingh2083 commented 2 years ago

Is it what this CS330 course (Multi-task & meta learning) covers? https://www.youtube.com/watch?v=0rZtSwNOTQo&list=PLoROMvodv4rMC6zfYmnD7UG3LVvwaITY5

Qengineering commented 2 years ago

Sorry, no. You are making this more complicated than it is. Multi-task and meta-learning refer here to learning a task in a way that carries over to different situations. So once the robot 'knows' how to screw a cap on a bottle, it can do the same in all other situations where a screw cap occurs. In our case of facial recognition, it's more a concatenation of different tasks, just like in 'normal' programming, where you have to perform operation a and operation b before c can be done. So here too: first find a face in the whole picture, then cut it out, then recognize it. Or: first find the number plate in the picture, then cut it out and recognize the characters.
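The find, cut out, recognize sequence can be sketched in a few lines. This is only an illustration under stated assumptions: `run_chain`, `crop`, `fake_detect` and `fake_recognize` are hypothetical names, the frame is a plain list of pixel rows, and the two stand-in functions take the place of real detection and recognition models.

```python
import time


def crop(frame, box):
    """Cut the rectangle (x1, y1, x2, y2) out of a frame stored as rows."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]


def run_chain(frame, detect, recognize):
    """Run the first model, then feed every detected region to the second."""
    results = []
    for box in detect(frame):           # step a: locate the objects
        region = crop(frame, box)       # step b: cut each one out
        results.append((box, recognize(region)))  # step c: recognize it
    return results


# Stand-ins for the real models (assumptions, for illustration only).
def fake_detect(frame):
    return [(1, 1, 3, 2)]


def fake_recognize(region):
    return "".join(str(value) for row in region for value in row)


frame = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11]]

start = time.perf_counter()
print(run_chain(frame, fake_detect, fake_recognize))
# → [((1, 1, 3, 2), '56')]
print(f"frame time: {time.perf_counter() - start:.6f} s")
```

If the first model finds nothing, the second model is never called, which saves time on every frame without a detection.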

rsingh2083 commented 2 years ago

OK, Sir. I think I need to study your FR code thoroughly; that alone should help. Could you also post a tutorial about this procedure on your Qengineering website? It is much needed, and hardly anyone is talking about it.