TencentARC / UMT

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
Other
193 stars 19 forks source link