TencentARC/UMT
UMT is a unified and flexible framework that handles different combinations of input modalities and outputs video moment retrieval and/or highlight detection results.
Other
193 stars, 19 forks
issues
Will the model automatically truncate the video if the video duration is greater than 150 seconds?
#54
ffiioonnaa
opened
5 months ago
0
Seed for Youtube Highlights Categories
#53
SalmaMohamedElsayed
closed
6 months ago
1
Model Instability
#52
SalmaMohamedElsayed
closed
7 months ago
3
about model
#51
youngprogrammerBee
closed
7 months ago
1
The forward method of UMT
#50
czzhao-sjtu
closed
8 months ago
1
My dataset
#49
anas2908
closed
8 months ago
1
Audio feature extraction
#48
SalmaMohamedElsayed
closed
8 months ago
1
The Checkpoint file requirement
#47
EasonXiao-888
closed
1 year ago
0
qvhighlights/umt_base_pretrain_100e_asr.py
#46
c1d2y3
closed
12 months ago
1
result of QVHighlights val set
#45
EasonXiao-888
closed
1 year ago
2
Inference code
#44
hadesfgh
closed
1 year ago
1
Error. TypeError: '>=' not supported between instances of 'DataContainer' and 'int'
#43
GuangtaoLyu
closed
1 year ago
2
Inference mode
#42
Rj-batista
closed
1 year ago
6
Any idea of model's general highlight effectiveness
#41
JW-xiilab
closed
1 year ago
2
Audio feature extraction
#40
GYGWG
closed
8 months ago
1
Model applicability
#39
gracikk-ds
closed
1 year ago
1
Query feature in TVSum highlight detection
#38
GYGWG
closed
1 year ago
1
TVSum training problem
#37
GYGWG
closed
1 year ago
2
Attention map visualization
#36
G-Apple1
closed
1 year ago
2
Code for the audio feature extraction part
#35
luyanger1799
closed
1 year ago
1
Model Computation Amount (FLOPs) and Number of Parameters (Params)
#34
Yangaiei
closed
2 years ago
1
Pretraining Problem
#33
Lonicer
closed
2 years ago
3
What is the horizontal coordinate of Figure 4 in the paper? What does it represent?
#32
G-Apple1
closed
2 years ago
3
results visualized
#31
Yangaiei
closed
2 years ago
2
Text embedding on charadesSTA dataset and some minor questions
#30
hsi1032
closed
2 years ago
5
Misalignment between video and audio for QVhighlight
#29
wjun0830
closed
2 years ago
2
model test
#28
Yangaiei
closed
2 years ago
14
save epoch problems
#27
xiaohuihui-com
closed
2 years ago
1
retrieve a video in real time
#26
Lynneyyq
closed
2 years ago
3
automatic learning rate adjustment
#25
Yangaiei
closed
2 years ago
2
How do I use the trained models available in model zoo
#24
AliButtarRB
closed
2 years ago
1
validate
#23
tangxiaochu123230
closed
2 years ago
1
audio feature extraction
#22
Yangaiei
closed
2 years ago
1
metric methods
#21
oomq
closed
2 years ago
6
Hello, questions about text feature extraction.
#20
Yangaiei
closed
2 years ago
5
Can you provide a demo about running predictions on my own videos and queries
#19
hpppppp8
closed
2 years ago
2
How to align the audio and video at the clip level
#18
Lynneyyq
closed
2 years ago
8
how to align the audio feature and video feature?
#17
Xuguozi
closed
2 years ago
7
How do I make my dataset
#16
Yangaiei
closed
2 years ago
3
RuntimeError: CUDA error: no kernel image is available for execution on the device
#15
hpppppp8
closed
2 years ago
1
how to visualize the results in your paper
#14
wenhaoHou
closed
2 years ago
1
feature extraction
#13
Xuguozi
closed
2 years ago
1
Can you provide the original video data? Especially YouTube Highlights.
#12
mxtx0509
closed
2 years ago
5
How can I annotate my own dataset?
#11
Xuguozi
closed
2 years ago
1
bug?? if (num_gt := sum(label)) == 0:
#10
Xuguozi
closed
2 years ago
7
How to prepare the data
#9
Lynneyyq
closed
2 years ago
1
.json annotation
#8
Lynneyyq
closed
2 years ago
1
feature extraction (i3d and optical flow)
#7
Lvqin001
closed
2 years ago
16
extract audio features
#6
G-Apple1
closed
2 years ago
1
How to extract video features
#5
Yangaiei
closed
2 years ago
1