I want to run inference on a single video by itself and get the result. How should I proceed, and is it possible to write a standalone inference script? For example, I would input a Chinese sign language video and get the corresponding predicted text. This is urgent, so any help is appreciated.