Thank you very much for sharing your great work!
I tried to reproduce the video recognition results but got very low accuracy.
Could you give me some advice on what I might have missed, or kindly provide a script that reproduces the accuracy reported in the table?
I tested the model with this script: jepa/evals/video_classification_frozen/eval.py
Config: vitl16_ssv2_16x2x3.yaml
nodes: 8
tasks_per_node: 8
tag: ssv2-16x2x3
eval_name: video_classification_frozen
resume_checkpoint: false
data:
  dataset_train: xx/ssv2_train.csv
  dataset_val: xx/ssv2_val.csv
  dataset_type: VideoDataset
  num_classes: 174
  frames_per_clip: 16
  num_segments: 1 #2
  num_views_per_segment: 3
  frame_step: 4
optimization:
  attend_across_segments: true
  num_epochs: 20
  resolution: 224
  batch_size: 16 #4
  weight_decay: 0.01
  lr: 0.001
  start_lr: 0.001
  final_lr: 0.0
  warmup: 0.
  use_bfloat16: true
pretrain:
  model_name: vit_large
  checkpoint_key: target_encoder
  clip_duration: null
  frames_per_clip: 16
  tubelet_size: 2
  uniform_power: true
  use_silu: false
  tight_silu: false
  use_sdpa: true
  patch_size: 16
  folder: xx/JEPA
  checkpoint: vitl16.pth
  write_tag: jepa
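
For reference, here is a rough sanity check of what the two values I changed imply (the trailing `#2` and `#4` comments are the values I replaced). It is only a sketch under assumptions I have not verified against the eval code: that `batch_size` is per process, that the number of processes is `nodes * tasks_per_node`, and that each video is evaluated on `num_segments * num_views_per_segment` clips. The helper `effective_settings` is hypothetical and not part of the repository.

```python
def effective_settings(nodes, tasks_per_node, batch_size,
                       num_segments, num_views_per_segment):
    """Derive aggregate quantities from the config values.

    Assumptions (not confirmed from the repository code):
    - batch_size is counted per process,
    - one process per task, so world size = nodes * tasks_per_node,
    - clips per video = num_segments * num_views_per_segment.
    """
    world_size = nodes * tasks_per_node
    return {
        "world_size": world_size,
        "global_batch_size": world_size * batch_size,
        "clips_per_video": num_segments * num_views_per_segment,
    }


# Values used in my run.
mine = effective_settings(nodes=8, tasks_per_node=8, batch_size=16,
                          num_segments=1, num_views_per_segment=3)

# Values from the trailing comments (#2 and #4) in the config above.
commented = effective_settings(nodes=8, tasks_per_node=8, batch_size=4,
                               num_segments=2, num_views_per_segment=3)

print("my run:    ", mine)
print("commented: ", commented)
```

If these assumptions hold, my run uses a four times larger global batch with the same `lr`, and half as many clips per video, which might be part of the gap.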