YoungJT-7274 closed this 1 year ago
You can refer to the ACM-Net code (https://github.com/ispc-lab/ACM-Net) and port our method to their codebase. For more details, please see https://github.com/LeonHLJ/RSKP/issues/3
GauFuse introduces a lot of new parameters, and I've noticed that some of GauFuse's parameters seem to differ from RSKP's. Can you give us more details about GauFuse's implementation on the ActivityNet dataset?
Read through RSKP carefully, then plug in our method. There are actually only a few new parameters; you can use the following config as a reference.
import argparse
parser = argparse.ArgumentParser(description='WSTAL')
parser.add_argument('--gpus', type=int, default=[0], nargs='+', help='used gpu')
parser.add_argument('--run-type', type=int, default=0, help='train (0) or evaluate (1)')
parser.add_argument('--model-id', type=str, default="rskp_baseline_anet", help='model id for saving model')
parser.add_argument('--pretrained', default=False, action='store_true', help='is pretrained model')
parser.add_argument('--load-epoch', type=int, default=None, help='epoch of loaded model')
# type=bool would turn any non-empty string (even "False") into True, so use a flag instead
parser.add_argument('--use_new_predictor', default=False, action='store_true', help='whether to use the new prediction module')
parser.add_argument('--save-interval', type=int, default=5, help='interval for storing model')
parser.add_argument('--dataset-root', default='./data/', help='dataset root path')
parser.add_argument('--dataset-name', default='Anet', help='dataset to train on')
parser.add_argument('--sample-segments-num', type=int, default=75, help='number of sampled segments')
parser.add_argument('--segment-frames-num', type=int, default=16, help='number of frames per segment')
parser.add_argument('--frames-per-sec', type=int, default=25, help='frames per second')
parser.add_argument('--test-upgrade-scale', type=int, default=20, help='upgrade scale used at test time')
parser.add_argument('--test-gt-file-path', default='./data/Anet/gt.json', help='path to the test ground-truth file')
parser.add_argument('--feature-type', type=str, default='I3D', help='type of feature to be used (default: I3D)')
parser.add_argument('--inp-feat-num', type=int, default=2048, help='size of input feature (default: 2048)')
parser.add_argument('--out-feat-num', type=int, default=2048, help='size of output feature (default: 2048)')
parser.add_argument('--class-num', type=int, default=200, help='number of classes (default: 200)')
parser.add_argument('--scale-factor', type=float, default=20.0, help='temperature factors')
parser.add_argument('--T', type=float, default=0.2, help='number of head')
parser.add_argument('--w', type=float, default=0.5, help='number of head')
parser.add_argument('--batch-size', type=int, default=128, help='number of instances in a batch of data (default: 128)')
parser.add_argument('--lr', type=float, default=0.00005, help='learning rate (default: 0.00005)')
parser.add_argument('--lr-decay', type=float, default=0.8, help='learning rate decay (default: 0.8)')
parser.add_argument('--weight-decay', type=float, default=0.0005, help='weight decay (default: 0.0005)')
parser.add_argument('--dropout', type=float, default=0.6, help='dropout value (default: 0.6)')
parser.add_argument('--seed', type=int, default=2, help='random seed (default: 2)')
parser.add_argument('--max-epoch', type=int, default=250, help='maximum number of epochs to train (default: 250)')
parser.add_argument('--mu-num', type=int, default=8, help='number of Gaussians')
parser.add_argument('--mu-queue-len', type=int, default=5, help='number of slots of each class of memory bank')
parser.add_argument('--em-iter', type=int, default=2, help='number of EM iteration')
parser.add_argument('--warmup-epoch', type=int, default=100, help='epoch starting to use the inter-video branch')
parser.add_argument('--class-threshold', type=float, default=0.16, help='class threshold for rejection')
parser.add_argument('--start-threshold', type=float, default=0.001, help='start threshold for action localization')
parser.add_argument('--end-threshold', type=float, default=0.04, help='end threshold for action localization')
parser.add_argument('--threshold-interval', type=float, default=0.002, help='threshold interval for action localization')
parser.add_argument('--decay-type', type=int, default=1, help='weight decay type (0 for None, 1 for step decay, 2 for cosine decay)')
# nargs='+' lets the command line pass several epochs, matching the list default
parser.add_argument('--changeLR_list', type=int, nargs='+', default=[80, 1000], help='change lr step')
parser.add_argument('--use_mem', type=int, default=1, help='0 not use 1 use')
parser.add_argument('--use_foreloss', type=int, default=1, help='0 not use 1 use')
parser.add_argument('--use_backloss', type=int, default=1, help='0 not use 1 use')
parser.add_argument('--use_attloss', type=int, default=1, help='0 not use 1 use')
parser.add_argument('--frm_coef', type=float, default=0.85, help='mix up pred and mu')
parser.add_argument('--fore_loss_weight', type=float, default=1, help='weight of foreground loss')
parser.add_argument('--spl_loss_weight', type=float, default=1., help='weight of pseudo label supervision loss')
parser.add_argument('--back_loss_weight', type=float, default=0.2, help='weight of background loss')
parser.add_argument('--att_loss_weight', type=float, default=0.1, help='weight of attention normalization loss')
parser.add_argument('--propotion', type=float, default=8., help='weight of attention normalization loss')
parser.add_argument('--temperature', type=float, default=1.0, help='temperature')
parser.add_argument('--weight', type=float, default=0.5, help='weight of attention normalization loss')
parser.add_argument('--o_weight', type=float, default=0.8, help='weight of attention normalization loss')
parser.add_argument('--m_weight', type=float, default=0.2, help='weight of attention normalization loss')
parser.add_argument('--action_cls_num', type=int, default=200, help='number of action classes')
parser.add_argument('--cls_threshold', type=float, default=0.02, help='class threshold')
parser.add_argument('--nms_thresh', type=float, default=0.02, help='NMS threshold')
parser.add_argument('--test_upgrade_scale', type=int, default=20, help='upgrade scale used at test time')
parser.add_argument('--video_num', type=int, default=9032, help='number of videos')
parser.add_argument('--sample_num', type=int, default=2, help='number of samples')
parser.add_argument('--out_feat_num', type=int, default=2048, help='size of output feature')
parser.add_argument('--momentum', type=float, default=0.99, help='momentum')
parser.add_argument('--dist_loss', type=float, default=1., help='weight of pseudo label supervision loss')
parser.add_argument('--dist_loss_increase', type=float, default=1.0, help='weight of pseudo label supervision loss')
parser.add_argument('--dist_power', type=float, default=0.05, help='weight of pseudo label supervision loss')
parser.add_argument('--dist_temperature', type=float, default=0.1, help='weight of pseudo label supervision loss')
parser.add_argument('--eval_temperature', type=float, default=0.2, help='weight of pseudo label supervision loss')
# type=list would split a command-line value into single characters; nargs parses a real list
# (element type of --load_dist_ckpt is assumed to be str, e.g. checkpoint paths)
parser.add_argument('--load_dist_ckpt', type=str, nargs='*', default=[])
parser.add_argument('--dist_epoch', type=int, nargs='+', default=[150, 160, 170, 180, 190, 200])
parser.add_argument('--dist_temperature_decay', type=float, nargs='+', default=[0.1, 0.1, 0.01, 0.01, 0.01, 0.008])
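If you want to override the boolean or list-valued options from the command line rather than edit the defaults, be aware of how argparse handles them. Below is a minimal sketch, not taken from the GauFuse or RSKP code: `build_parser` is a hypothetical helper, and only the option names and default values are copied from the config above. It shows why `type=bool` and `type=list` do not parse as one might expect, and one way to declare such options so overrides work.

```python
import argparse

# Why type=bool / type=list are problematic: argparse calls the type on the raw string.
assert bool("False") is True              # any non-empty string is truthy
assert list("150") == ["1", "5", "0"]     # a string is split into characters


def build_parser():
    """Hypothetical mini-parser covering three options from the config above."""
    parser = argparse.ArgumentParser(description="WSTAL")
    # A flag: absent -> False, present -> True.
    parser.add_argument("--use_new_predictor", action="store_true")
    # nargs='+' collects one or more values, each converted with `type`.
    parser.add_argument("--dist_epoch", type=int, nargs="+",
                        default=[150, 160, 170, 180, 190, 200])
    parser.add_argument("--dist_temperature_decay", type=float, nargs="+",
                        default=[0.1, 0.1, 0.01, 0.01, 0.01, 0.008])
    return parser


if __name__ == "__main__":
    # Equivalent to: python main.py --use_new_predictor --dist_epoch 150 175
    args = build_parser().parse_args(
        ["--use_new_predictor", "--dist_epoch", "150", "175"]
    )
    print(args.use_new_predictor, args.dist_epoch, args.dist_temperature_decay)
```

With this declaration, leaving an option off the command line keeps the default list from the config, while passing values replaces the whole list.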
Thanks for your reply!
Thank you for the excellent code. I am wondering if you could share the training config for the ActivityNet dataset.