ToanICV opened this issue 7 months ago (status: Open)
I only have 2 x P40 GPUs (24 GB each). Training command:
python3 -m torch.distributed.launch --nproc_per_node 2 --use_env training.py --config experiments/configs/SingleStream/vsl-edu_s2g.yaml
This is my config:
task: S2G
data:
  input_data: videos #features, gloss
  zip_file: /root/working/vsl_edu_v2_15.zip
  input_streams:
    - rgb
  dataset_name: vsl-edu
  level: word #word or char
  txt_lowercase: true
  max_sent_length: 400
  train: /root/input/vsl-edu-labels/myVSL_p2_15.train
  dev: /root/input/vsl-edu-labels/myVSL_p2_15.dev
  test: /root/input/vsl-edu-labels/myVSL_p2_15.test
  transform_cfg:
    img_size: 224
    aug_hflip: false
    color_jitter: false
    bottom_area: 0.9
    csl_cut: False
    csl_resize:
      - 224
      - 224
    center_crop: false
    center_crop_size: 270
    randomcrop_threshold: 1
    aspect_ratio_min: 0.75
    aspect_ratio_max: 1.3
    temporal_augmentation:
      tmin: 0.5
      tmax: 1.5
testing:
  cfg:
    recognition:
      beam_size: 5
training:
  random_seed: 44
  overwrite: False
  model_dir: /root/working/TwoStreamNetworkVi/TwoStreamNetwork/experiments/outputs/SingleStream/vsl-edu_s2g
  shuffle: True
  num_workers: 4
  batch_size: 1
  total_epoch: 100
  keep_last_ckpts: 1
  validation:
    unit: epoch
    freq: 1
    cfg:
      recognition:
        beam_size: 1
  optimization:
    optimizer: Adam
    learning_rate:
      default: 5.0e-2
    weight_decay: 0.001
    betas:
      - 0.9
      - 0.998
    scheduler: cosineannealing
    t_max: 40
model:
  RecognitionNetwork:
    GlossTokenizer:
      gloss2id_file: /root/input/vsl-edu-labels/gloss2ids_vsl_p2_15.pkl
    s3d:
      pretrained_ckpt: pretrained_models/s3ds_glosscls_ckpt
      use_block: 4
      freeze_block: 1
    visual_head:
      input_size: 832
      hidden_size: 512
      ff_size: 2048
      pe: True
      ff_kernelsize:
        - 3
        - 3
Peace be upon you. How did you prepare your custom data for the G2T task? Can you tell me the roadmap for it?
Have you found a solution?
Well, it is difficult to debug given just the configs... Basically, to train on your custom dataset, you only need to use your own split files (xxx.train, xxx.dev, xxx.test), and modify the dataloader if necessary. For the S2G task, you only need to make sure the video-gloss pairs are correct.
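One way to check that the video-gloss pairs are correct is a quick sanity pass over the annotations before training. The sketch below is only an illustration under stated assumptions: it assumes each split file has already been loaded into a list of dicts with `name` and `gloss` keys, that the gloss vocabulary is a plain dict mapping gloss to id, and that the video names in the zip file have been collected into a set. None of these layouts is confirmed in this thread, and `check_pairs` is a hypothetical helper, not part of the repository.

```python
from collections import Counter


def check_pairs(samples, gloss2id, video_names):
    """Report common problems in video-gloss pairs.

    samples     : list of dicts with 'name' and 'gloss' keys (assumed layout)
    gloss2id    : dict mapping gloss token -> integer id (assumed layout)
    video_names : set of video names actually present in the data archive
    """
    missing_videos = []  # annotated samples with no matching video
    empty_gloss = []     # samples whose gloss string has no tokens
    oov = Counter()      # gloss tokens absent from the vocabulary

    for sample in samples:
        name = sample["name"]
        tokens = sample["gloss"].split()
        if name not in video_names:
            missing_videos.append(name)
        if not tokens:
            empty_gloss.append(name)
        for token in tokens:
            if token not in gloss2id:
                oov[token] += 1

    return {
        "missing_videos": missing_videos,
        "empty_gloss": empty_gloss,
        "oov_glosses": dict(oov),
    }


# Toy data standing in for a real split file, vocabulary, and archive listing.
samples = [
    {"name": "vid_001", "gloss": "HELLO YOU"},
    {"name": "vid_002", "gloss": "THANK-YOU"},
    {"name": "vid_003", "gloss": ""},
]
gloss2id = {"<pad>": 0, "HELLO": 1, "YOU": 2}
video_names = {"vid_001", "vid_003"}

report = check_pairs(samples, gloss2id, video_names)
print(report)
```

If any of the three lists comes back non-empty on the real data, the labels the model is trained against may not match the videos it sees, which is worth ruling out before tuning anything else.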
Hi. Thank you for your work and for open-sourcing it. I'm trying to train on a custom dataset from scratch. I have videos, glosses, and texts. I plan to train SingleStream first to get a baseline. I have trained G2T and the result is good (BLEU-4: 43.76, ROUGE: 70.23). Now I'm training S2G, but the result is poor (WER stays around 95-100 and the loss around 60). I believe I am missing something. Could you share a roadmap or something similar for training on a custom dataset? Thank you so much.
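For context on the WER figure, here is a minimal sketch of a generic word-level WER, computed as edit distance divided by reference length. It is not necessarily the repository's exact evaluation code, and the function name `wer` is chosen here only for illustration. It shows that an empty prediction scores exactly 100, so a WER stuck near 95-100 may mean the decoded gloss sequences are empty or nearly constant, which can be checked by printing a few hypotheses next to their references.

```python
def wer(reference, hypothesis):
    """Word error rate in percent between two whitespace-tokenized strings."""
    ref = reference.split()
    hyp = hypothesis.split()

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur

    # Guard against an empty reference to avoid division by zero.
    return 100.0 * prev[-1] / max(len(ref), 1)


print(wer("A B C", "A B C"))  # identical sequences
print(wer("A B C", ""))       # empty prediction
print(wer("A B C", "A C"))    # one missing gloss
```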