OpenGVLab / VideoMamba

VideoMamba: State Space Model for Efficient Video Understanding
https://arxiv.org/abs/2403.06977
Apache License 2.0

Disabling use of Slurm #40

Open matbee-eth opened 2 months ago

matbee-eth commented 2 months ago

I'm not sure what Slurm is, but since it doesn't work even after installing it, I'd like to avoid it entirely.

Any tips for removing it as a dependency?

Andy1621 commented 2 months ago

Which task do you want to run? For most tasks, you can simply remove `srun` from the scripts and then run the training directly.
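For illustration, a minimal sketch of what that change can look like (the exact launch line differs per exp script, so treat this as a hypothetical example, not the repo's actual script):

```bash
# Hypothetical example of stripping Slurm from a launch script.
# A Slurm-based script might contain a line like:
#   srun -p ${PARTITION} --gres=gpu:${GPUS} python -u run_class_finetuning.py "$@"
# Without Slurm, launch the same entry point directly with torchrun
# (single node, one process per GPU); "$@" forwards the task-specific flags:
torchrun --nnodes=1 --nproc-per-node=${GPUS:-1} run_class_finetuning.py "$@"
```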

pravn commented 2 months ago

Hi - I got around this issue by using torchrun. Here is my run script; change `--nproc-per-node` to match the number of GPUs you want to use (assuming a single node).

```bash
torchrun --nnodes=1 --nproc-per-node=1 \
    run_class_finetuning.py \
    --model videomamba_middle \
    --finetune /home/ubuntu/VideoMamba/videomamba/video_sm/exp/k400/videomamba_middle_mask/videomamba_middle_mask_ft_f8_res224/checkpoint-latest.pth \
    --data_path ${DATA_PATH} \
    --prefix ${PREFIX} \
    --data_set 'Kinetics_sparse' \
    --split ',' \
    --nb_classes 400 \
    --log_dir ${OUTPUT_DIR} \
    --output_dir ${OUTPUT_DIR} \
    --batch_size 8 \
    --num_sample 2 \
    --input_size 224 \
    --short_side_size 224 \
    --save_ckpt_freq 100 \
    --num_frames 8 \
    --num_workers 12 \
    --warmup_epochs 5 \
    --tubelet_size 1 \
    --epochs 45 \
    --lr 1e-4 \
    --layer_decay 0.8 \
    --drop_path 0.4 \
    --opt adamw \
    --opt_betas 0.9 0.999 \
    --weight_decay 0.05 \
    --test_num_segment 4 \
    --test_num_crop 3 \
    --dist_eval \
    --test_best \
    --bf16
```
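The `${DATA_PATH}`, `${PREFIX}`, and `${OUTPUT_DIR}` variables above need to be set before launching. The paths below are placeholders to show the shape of the setup, not values from the repo:

```bash
# Placeholder paths; adjust to your local Kinetics-400 layout.
export DATA_PATH='/path/to/k400/annotations'   # directory with the annotation/csv files
export PREFIX='/path/to/k400/videos'           # root path prepended to video paths in the annotations
export OUTPUT_DIR='./exp/k400/videomamba_middle_ft'
mkdir -p "${OUTPUT_DIR}"
```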