VQAssessment / FAST-VQA-and-FasterVQA

[ECCV2022, TPAMI2023] FAST-VQA, and its extended version FasterVQA.
https://www.ecva.net/papers/eccv_2022/papers_ECCV/html/1225_ECCV_2022_paper.php

finetune #31

Closed cs19469 closed 1 year ago

cs19469 commented 1 year ago

Hello, when I fine-tune on KoNViD, the result is very bad (PLCC is only 0.378). The following is from the .yml file:

load_path: ./pretrained_weights/FAST_VQA_B_1_4.pth
test_load_path:

I have only modified the data paths.
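
For reference, a minimal sketch of how PLCC/SRCC can be computed from predicted and ground-truth scores with scipy; the file names below are placeholders, not files produced by this repository:

# Hypothetical check of the correlation metrics, independent of the repo's own evaluation code.
import numpy as np
from scipy.stats import pearsonr, spearmanr

pred = np.load("pred_scores.npy")  # placeholder: predicted quality scores
gt = np.load("gt_scores.npy")      # placeholder: ground-truth MOS

plcc, _ = pearsonr(pred, gt)   # Pearson linear correlation (PLCC)
srcc, _ = spearmanr(pred, gt)  # Spearman rank correlation (SRCC)
print(f"PLCC={plcc:.3f}  SRCC={srcc:.3f}")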

teowu commented 1 year ago

Hi, may I know whether you are using split_train.py? This can happen when the pre-trained weights are not properly loaded, but it is not likely to happen with the current version of split_train.py (we hit that bug several versions ago and removed it).
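
If the suspicion is that the pre-trained weights were silently not loaded, a minimal sketch of how to check this (assuming the checkpoint stores its weights under a "state_dict" key; adjust if your checkpoint differs):

# Sketch: verify that the checkpoint keys actually match the constructed model.
import torch

ckpt = torch.load("./pretrained_weights/FAST_VQA_B_1_4.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # fall back to a plain state dict

# `model` is the DiViDeAddEvaluator already built by the training script.
result = model.load_state_dict(state_dict, strict=False)
print("missing keys:", len(result.missing_keys))
print("unexpected keys:", len(result.unexpected_keys))
# A large number of missing/unexpected keys usually indicates a key-prefix
# mismatch, i.e. the pre-trained weights were effectively not loaded.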

cs19469 commented 1 year ago

Thank you for your reply! Yes, the file I'm using is split_train.py, and the pre-trained weights are FAST_VQA_B_1_4.pth. Is there something wrong with what I did? Maybe I should try the new version.

cs19469 commented 1 year ago

Hello, another question I'd like to ask: is the .pkl file under the result folder a model saved during fine-tuning?

teowu commented 1 year ago

> Hello, another question I'd like to ask: is the .pkl file under the result folder a model saved during fine-tuning?

Not exactly. The results should be saved as ".pth" files under ./pretrained_weights.

> the file I'm using is split_train.py, the pre-trained weights are FAST_VQA_B_1_4.pth

Let me check your settings for fine-tuning.
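
As a side note, a hedged sketch of reloading such a fine-tuned .pth for evaluation (the file name is a placeholder for whatever the run actually saved):

# Sketch: reload a fine-tuned checkpoint saved under ./pretrained_weights.
import torch

ckpt = torch.load("./pretrained_weights/my_finetuned_run.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)
model.load_state_dict(state_dict)  # `model` built as in the training/testing scripts
model.eval()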

cs19469 commented 1 year ago

Just like the following:

name: FAST-B_1*4_To_YouTubeUGC
num_epochs: 20
l_num_epochs: 10
warmup_epochs: 2.5
ema: true
save_model: true
batch_size: 8
num_workers: 0
split_seed: 42

wandb:
    project_name: VQA_Experiments_2022

data:
    train:
        type: FusionDataset
        args:
            phase: train
            anno_file: ./examplar_data_labels/YouTubeUGC/name1.txt
            data_prefix: /data/ch/UGC/original_videos_h264/
            sample_types:
                fragments:
                    fragments_h: 7
                    fragments_w: 7
                    fsize_h: 32
                    fsize_w: 32
                    aligned: 32
                    clip_len: 32
                    frame_interval: 2
                    num_clips: 1
    val:
        type: FusionDataset
        args:
            phase: test
            anno_file: ./examplar_data_labels/YouTubeUGC/name1.txt
            data_prefix: /data/ch/UGC/original_videos_h264/
            sample_types:
                #resize:
                #    size_h: 224
                #    size_w: 224
                fragments:
                    fragments_h: 7
                    fragments_w: 7
                    fsize_h: 32
                    fsize_w: 32
                    aligned: 32
                    clip_len: 32
                    frame_interval: 2
                    num_clips: 4

model:
    type: DiViDeAddEvaluator
    args:
        backbone:
            fragments:
                checkpoint: false
                pretrained:
        backbone_size: swin_tiny_grpb
        backbone_preserve_keys: fragments
        divide_head: false
        vqa_head:
            in_channels: 768
            hidden_channels: 64

optimizer:
    lr: !!float 1e-3
    backbone_lr_mult: !!float 1e-1
    wd: 0.05

load_path: ./pretrained_weights/FAST_VQA_B_1_4.pth
test_load_path:
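
Before launching training, one quick way to rule out a path problem is to parse the .yml and check that load_path, anno_file and data_prefix actually exist (a minimal sketch assuming PyYAML; "my_finetune.yml" is a placeholder for the config file above):

# Sketch: sanity-check the fine-tuning config before training.
import os
import yaml

with open("my_finetune.yml") as f:
    opt = yaml.safe_load(f)

print("load_path exists:", os.path.exists(opt["load_path"]))

train_args = opt["data"]["train"]["args"]
print("anno_file exists:", os.path.exists(train_args["anno_file"]))
print("data_prefix exists:", os.path.isdir(train_args["data_prefix"]))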

cs19469 commented 1 year ago

Sorry to bother you again. I'd like to ask whether you have fine-tuned models for the other datasets (KoNViD, UGC, etc.)?

DAVID-Hown commented 1 year ago

Can you communicate with me? Email: howndawei@gmail.com

cs19469 commented 1 year ago

> Can you communicate with me? Email: howndawei@gmail.com

Yeah, of course.

DAVID-Hown commented 1 year ago

import numpy as np

mean_stds = {
    "FasterVQA": (0.14759505, 0.03613452),
    "FasterVQA-MS": (0.15218826, 0.03230298),
    "FasterVQA-MT": (0.14699507, 0.036453716),
    "FAST-VQA": (-0.110198185, 0.04178565),
    "FAST-VQA-M": (0.023889644, 0.030781006),
}

def sigmoid_rescale(score, model="FasterVQA"):
    mean, std = mean_stds[model]
    x = (score - mean) / std
    print(f"Inferring with model [{model}]:")
    score = 1 / (1 + np.exp(-x))
    return score

teowu commented 1 year ago

Hi David, you may refer to the following code for set-wise inference with a given variant (e.g. FasterVQA):

def rescale(pr, gt=None):
    print("mean", np.mean(pr), "std", np.std(pr))
    pr = (pr - np.mean(pr)) / np.std(pr)
    return pr

Then hard-code the resulting mean and std based on your needs.

Hope this helps.
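
Putting the two snippets together, a hedged sketch of what that hard-coding could look like: collect the raw scores over the whole set, take the mean/std that rescale prints, and use them in place of the per-model entry in mean_stds:

import numpy as np

# Placeholder data: replace with the raw model outputs collected over your set.
raw_scores = np.asarray([0.12, -0.05, 0.33, 0.08])

set_mean, set_std = np.mean(raw_scores), np.std(raw_scores)

def sigmoid_rescale(score, mean, std):
    # Same mapping as the earlier snippet, but using the set-wise
    # statistics instead of the hard-coded mean_stds table.
    x = (score - mean) / std
    return 1 / (1 + np.exp(-x))

rescaled = sigmoid_rescale(raw_scores, set_mean, set_std)
print(rescaled)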