Hi @Caspeerrr, we have released AdaTAD's results on Multi-THUMOS here. The performance is as follows.
Backbone | Frames | Img Size | mAP@0.2 | mAP@0.5 | mAP@0.7 | ave. mAP (0.1:0.9:0.1) |
---|---|---|---|---|---|---|
VideoMAE-S | 768 | 160 | 61.34 | 46.74 | 26.88 | 40.77 |
VideoMAE-B | 768 | 160 | 63.90 | 48.74 | 28.72 | 42.76 |
VideoMAE-L | 768 | 160 | 66.06 | 51.80 | 31.73 | 45.15 |
VideoMAE-H | 768 | 160 | 67.20 | 52.99 | 32.70 | 46.02 |
VideoMAEV2-g | 768 | 160 | 68.23 | 53.87 | 33.03 | 46.74 |
VideoMAEV2-g | 1536 | 224 | 71.11 | 55.83 | 34.86 | 48.73 |
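
For anyone unsure how to read the last column: a minimal sketch, assuming "ave. mAP (0.1:0.9:0.1)" is the plain mean of the per-threshold mAP values at tIoU 0.1, 0.2, ..., 0.9. The helper names `tiou_thresholds` and `average_map` are hypothetical and not part of AdaTAD's code. Since the table lists only three of the nine thresholds, the average column cannot be recomputed from the table alone.

```python
def tiou_thresholds(start=0.1, stop=0.9, step=0.1):
    """Expand a 'start:stop:step' range into a list of tIoU thresholds (stop inclusive)."""
    n = round((stop - start) / step) + 1
    # Round each value so floating-point drift (e.g. 0.30000000000000004) is removed.
    return [round(start + i * step, 2) for i in range(n)]


def average_map(per_threshold_map):
    """Mean of per-threshold mAP values, given in the same order as tiou_thresholds().

    Assumption: the reported average is an unweighted mean over all thresholds.
    """
    thresholds = tiou_thresholds()
    if len(per_threshold_map) != len(thresholds):
        raise ValueError(
            f"expected {len(thresholds)} mAP values, got {len(per_threshold_map)}"
        )
    return sum(per_threshold_map) / len(per_threshold_map)
```

Usage would look like `average_map(values)`, where `values` holds the nine mAP numbers from your own evaluation log, ordered from tIoU 0.1 to 0.9.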
Congrats on the great work! I was wondering whether AdaTAD supports multi-label temporal action detection (e.g., for Multi-THUMOS)? Thanks!