Closed PeterrHH closed 10 months ago
Is there any way to run inference on a video as input and output multiple recognized glosses/words?
Normally, you can run inference on the Dev and Test sets of a dataset, as described in the 'Inference' section of the README.md.
Is there any way to run inference on a video as input and output multiple recognized glosses/words?