FangyunWei / SLRT


Training with custom data, what should I do? #56

Open ToanICV opened 7 months ago

ToanICV commented 7 months ago

Hi. Thank you for your work and for open-sourcing it. I'm trying to train on a custom dataset from scratch. I have videos, glosses, and texts. I plan to train SingleStream first to get a baseline. I have trained G2T and the result is good (BLEU-4: 43.76, ROUGE: 70.23). Now I'm training S2G, but the result is poor (WER stays around 95-100 and the loss around 60). I believe I am missing something. Could you share a roadmap or similar guidance for training on a custom dataset? Thank you so much.
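
For readers unfamiliar with the metric: WER is the word-level edit distance between the predicted and reference gloss sequences, divided by the reference length, so a value near 100 means almost no predicted gloss lines up with the reference. The following is a minimal sketch of that definition. It is not the repository's evaluation code, and the function name `wer` is chosen here for illustration only.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    print(wer("A B C D", "A X C"))
```

Because insertions are counted too, WER can exceed 100 when the prediction is much longer than the reference.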

ToanICV commented 7 months ago

I only have 2 x P40 GPUs (24 GB each). Training command:

python3 -m torch.distributed.launch --nproc_per_node 2 --use_env training.py --config experiments/configs/SingleStream/vsl-edu_s2g.yaml

This is my config:

    task: S2G
    data:
      input_data: videos #features, gloss
      zip_file: /root/working/vsl_edu_v2_15.zip
      input_streams:
        - rgb
      dataset_name: vsl-edu
      level: word #word or char
      txt_lowercase: true
      max_sent_length: 400
      train: /root/input/vsl-edu-labels/myVSL_p2_15.train
      dev: /root/input/vsl-edu-labels/myVSL_p2_15.dev
      test: /root/input/vsl-edu-labels/myVSL_p2_15.test
      transform_cfg:
        img_size: 224
        aug_hflip: false
        color_jitter: false
        bottom_area: 0.9
        csl_cut: False
        csl_resize:
          - 224
          - 224
        center_crop: false
        center_crop_size: 270
        randomcrop_threshold: 1
        aspect_ratio_min: 0.75
        aspect_ratio_max: 1.3
        temporal_augmentation:
          tmin: 0.5
          tmax: 1.5
    testing:
      cfg:
        recognition:
          beam_size: 5
    training:
      random_seed: 44
      overwrite: False
      model_dir: /root/working/TwoStreamNetworkVi/TwoStreamNetwork/experiments/outputs/SingleStream/vsl-edu_s2g
      shuffle: True
      num_workers: 4
      batch_size: 1
      total_epoch: 100
      keep_last_ckpts: 1
      validation:
        unit: epoch
        freq: 1
        cfg:
          recognition:
            beam_size: 1
      optimization:
        optimizer: Adam
        learning_rate:
          default: 5.0e-2
        weight_decay: 0.001
        betas:
          - 0.9
          - 0.998
        scheduler: cosineannealing
        t_max: 40
    model:
      RecognitionNetwork:
        GlossTokenizer:
          gloss2id_file: /root/input/vsl-edu-labels/gloss2ids_vsl_p2_15.pkl
        s3d:
          pretrained_ckpt: pretrained_models/s3ds_glosscls_ckpt
          use_block: 4
          freeze_block: 1
        visual_head:
          input_size: 832
          hidden_size: 512
          ff_size: 2048
          pe: True
          ff_kernelsize:
            - 3
            - 3

HebaRaslan commented 5 months ago

Peace be upon you. How did you prepare your custom data for the G2T task? Can you tell me the roadmap for it?

len2618187 commented 5 months ago

Have you found a solution?

2000ZRL commented 5 months ago

Well, it is difficult to debug given just the configs... Basically, to train on your custom dataset, you only need to use your own split files (xxx.train, xxx.dev, xxx.test), and modify the dataloader if necessary. For the S2G task, you only need to make sure the video-gloss pairs are correct.
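
One way to check that the video-gloss pairs are correct is a small sanity pass over each split before training. The sketch below is only an illustration under stated assumptions: it takes the samples as an in-memory list of dicts with hypothetical fields `name`, `gloss` and `num_frames` (the actual split-file layout may differ, so adapt the loading step), and it assumes the visual backbone reduces the temporal length by a factor of `downsample=4`. The last check reflects a general property of CTC training: the input sequence must be at least as long as the target gloss sequence.

```python
def find_bad_pairs(samples, gloss2id, video_names, downsample=4):
    """Return (sample name, problem) tuples for suspicious video-gloss pairs.

    samples:     list of dicts with hypothetical keys 'name', 'gloss', 'num_frames'
    gloss2id:    mapping from gloss string to integer id (the gloss vocabulary)
    video_names: set of video names actually present in the video archive
    downsample:  assumed temporal downsampling factor of the visual backbone
    """
    problems = []
    for sample in samples:
        name = sample["name"]
        glosses = sample["gloss"].split()

        if name not in video_names:
            problems.append((name, "video not found"))
        if not glosses:
            problems.append((name, "empty gloss sequence"))

        unknown = [g for g in glosses if g not in gloss2id]
        if unknown:
            problems.append((name, "glosses missing from vocabulary: " + " ".join(unknown)))

        # CTC needs at least as many input steps as target labels.
        if sample["num_frames"] // downsample < len(glosses):
            problems.append((name, "too few frames for the gloss sequence"))
    return problems


if __name__ == "__main__":
    # Toy data for illustration only.
    vocab = {"HELLO": 0, "WORLD": 1}
    videos = {"v1", "v2"}
    split = [
        {"name": "v1", "gloss": "HELLO WORLD", "num_frames": 40},
        {"name": "v2", "gloss": "HELLO MISSING", "num_frames": 40},
    ]
    for name, problem in find_bad_pairs(split, vocab, videos):
        print(name, "->", problem)
```

An empty result does not prove the labels are right, but any reported pair is worth inspecting by hand before looking at the model or optimizer settings.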