sacmehta / EdgeNets

This repository contains the source code of our work on designing efficient CNNs for computer vision
MIT License

Question about the Experiment of Image Multi-label Classification #10

Closed mymuli closed 5 years ago

mymuli commented 5 years ago

In the paper "DiCENet: Dimension-wise Convolutions for Efficient Networks", the network width scaling parameter s can be selected, but in the image multi-label classification experiment, s = 0.1 gives an error. (When I run the program with s = 0.2, my machine cannot handle it, yet the network at s = 0.1 has less than half the parameters.) Can you provide a configuration for s = 0.1?

sacmehta commented 5 years ago

Could you please elaborate on the error details?

mymuli commented 5 years ago

@sacmehta

```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.1 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 64 --s 0.2
```

When I execute the command above with s set to 0.2, the program gets stuck because my graphics card does not have enough memory.

(screenshot)

---

(screenshot)

In Table 1, the FLOPs are 12M when s = 0.2 and 6.5M when s = 0.1, so I want to set s to 0.1.

```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.1 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 64 --s 0.1
```

When I execute the command above for the image multi-label classification experiment with s set to 0.1, the program reports the error shown in the figure below.

(screenshot of the error)

Could you tell me how to fix this? Thank you very much.

sacmehta commented 5 years ago

Uncomment this line and it should work. Note that we have not provided pretrained ImageNet weights for this configuration, so you might want to train on ImageNet first.

https://github.com/sacmehta/EdgeNets/blob/08f11290b2918f743dfe56f4f5f73e2b6213a17a/model/classification/dicenet_config.py#L7
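For reference, that file maps each supported width multiplier `s` to its channel configuration, and the fix is simply to uncomment the `s = 0.1` entry. A rough illustration of the shape of that change (the dictionary name and channel numbers below are placeholders, not the actual values from `dicenet_config.py`):

```python
# Hypothetical illustration only; see dicenet_config.py for the real names and values.
sc_ch_dict = {
    # 0.1: [8, 8, 16, 32, 64, 512],   # <- uncommenting this line enables --s 0.1
    0.2: [16, 16, 32, 64, 128, 1024],
    # ... remaining width multipliers ...
}
```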

mymuli commented 5 years ago

@sacmehta Thank you very much for your reply. I have another question.

In the paper "ELASTIC: Improving CNNs with Dynamic Scaling Policies", the dataset used is MS-COCO 2014, with a training/validation split of 82783/40504 images. The datasets cited in "DiCENet: Dimension-wise Convolutions for Efficient Networks" and "ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network" are also MS-COCO 2014, but the corresponding DiCENet experiment in this repository uses MS-COCO 2017, with a training/validation split of 118287/5000. Given this difference in training and validation sets, won't the comparison between DiCENet and ELASTIC be a problem?

Code segment corresponding to ELASTIC: coco2014

Code segment corresponding to DiCENet: coco2017

sacmehta commented 5 years ago

In our paper, we used the same split as ELASTIC.

sacmehta commented 5 years ago

Since COCO 2017 is the latest version, we set the defaults to the 2017 split. You can change them if you want.
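For reference, switching back to the 2014 split amounts to changing the `year` argument where the COCO loaders are constructed in `train_classification.py`. A minimal sketch, using the argument names of this repo's `COCOClassification` loader:

```python
from data_loader.classification.coco import COCOClassification

# same calls as the COCO branch of train_classification.py, with year changed to '2014'
train_dataset = COCOClassification(root=args.data, split='train', year='2014',
                                   inp_size=args.inpSize, scale=args.scale, is_training=True)
val_dataset = COCOClassification(root=args.data, split='val', year='2014',
                                 inp_size=args.inpSize, is_training=False)
```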

mymuli commented 5 years ago

@sacmehta

Code modified to use the 2014 version of the dataset:

```python
# -*- coding: utf-8 -*-
import torch
import argparse
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
from data_loader.classification import imagenet as img_loader
import random
import os
from tensorboardX import SummaryWriter
import time
from utilities.utils import model_parameters, compute_flops
from utilities.utils import save_checkpoint
import numpy as np
from utilities.print_utils import *
from torch import nn
from PIL import Image
import torchvision.transforms as transforms
import torchvision.datasets as datasets

"""
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
"""

class CocoDetection(datasets.coco.CocoDetection):
    def __init__(self, root, annFile, transform=None, target_transform=None):
        from pycocotools.coco import COCO
        self.root = root
        self.coco = COCO(annFile)
        self.ids = list(self.coco.imgs.keys())
        self.transform = transform
        self.target_transform = target_transform
        # map the non-contiguous COCO category ids to contiguous indices 0..79
        self.cat2cat = dict()
        for cat in self.coco.cats.keys():
            self.cat2cat[cat] = len(self.cat2cat)
        print(self.cat2cat)

    def __getitem__(self, index):
        coco = self.coco
        img_id = self.ids[index]
        ann_ids = coco.getAnnIds(imgIds=img_id)
        target = coco.loadAnns(ann_ids)

        # multi-hot label of shape (3, 80): one row each for small (< 32x32),
        # medium (< 96x96) and large objects, as in the ELASTIC multi-label setup
        output = torch.zeros((3, 80), dtype=torch.long)
        for obj in target:
            if obj['area'] < 32 * 32:
                output[0][self.cat2cat[obj['category_id']]] = 1
            elif obj['area'] < 96 * 96:
                output[1][self.cat2cat[obj['category_id']]] = 1
            else:
                output[2][self.cat2cat[obj['category_id']]] = 1
        target = output

        path = coco.loadImgs(img_id)[0]['file_name']
        img = Image.open(os.path.join(self.root, path)).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)

        if self.target_transform is not None:
            target = self.target_transform(target)
        return img, target

def main(args):

# -----------------------------------------------------------------------------

# Create model
# -----------------------------------------------------------------------------
if args.model == 'dicenet':
    from model.classification import dicenet as net
    model = net.CNNModel(args)
elif args.model == 'espnetv2':
    from model.classification import espnetv2 as net
    model = net.EESPNet(args)
elif args.model == 'shufflenetv2':
    from model.classification import shufflenetv2 as net
    model = net.CNNModel(args)
else:
    print_error_message('Model {} not yet implemented'.format(args.model))
    exit()

if args.finetune:
    # load the weights for finetuning
    if os.path.isfile(args.weights_ft):
        pretrained_dict = torch.load(args.weights_ft, map_location=torch.device('cpu'))
        print_info_message('Loading pretrained basenet model weights')
        model_dict = model.state_dict()

        overlap_dict = {k: v for k, v in model_dict.items() if k in pretrained_dict}

        total_size_overlap = 0
        for k, v in enumerate(overlap_dict):
            total_size_overlap += torch.numel(overlap_dict[v])

        total_size_pretrain = 0
        for k, v in enumerate(pretrained_dict):
            total_size_pretrain += torch.numel(pretrained_dict[v])

        if len(overlap_dict) == 0:
            print_error_message('No overlaping weights between model file and pretrained weight file. Please check')

        print_info_message('Overlap ratio of weights: {:.2f} %'.format(
            (total_size_overlap * 100.0) / total_size_pretrain))

        model_dict.update(overlap_dict)
        model.load_state_dict(model_dict, strict=False)
        print_info_message('Pretrained basenet model loaded!!')
    else:
        print_error_message('Unable to find the weights: {}'.format(args.weights_ft))

# -----------------------------------------------------------------------------
# Writer for logging
# -----------------------------------------------------------------------------
if not os.path.isdir(args.savedir):
    os.makedirs(args.savedir)
writer = SummaryWriter(log_dir=args.savedir, comment='Training and Validation logs')
try:
    writer.add_graph(model, input_to_model=torch.randn(1, 3, args.inpSize, args.inpSize))
except:
    print_log_message("Not able to generate the graph. Likely because your model is not supported by ONNX")

# network properties
num_params = model_parameters(model)
flops = compute_flops(model)
print_info_message('FLOPs: {:.2f} million'.format(flops))
print_info_message('Network Parameters: {:.2f} million'.format(num_params))

# -----------------------------------------------------------------------------
# Optimizer
# -----------------------------------------------------------------------------

optimizer = torch.optim.SGD(model.parameters(), args.lr, momentum=args.momentum, weight_decay=args.weight_decay)

# optionally resume from a checkpoint
best_acc = 0.0
num_gpus = torch.cuda.device_count()
print("num_gpus: ", num_gpus)
device = 'cuda' if num_gpus >= 1 else 'cpu'
print("device: ", device)
print("***********************************")
if args.resume:
    if os.path.isfile(args.resume):
        print_info_message("=> loading checkpoint '{}'".format(args.resume))
        checkpoint = torch.load(args.resume)
        args.start_epoch = checkpoint['epoch']
        best_acc = checkpoint['best_prec1']
        model.load_state_dict(checkpoint['state_dict'])  # note: map_location belongs to torch.load, not load_state_dict
        optimizer.load_state_dict(checkpoint['optimizer'])
        print_info_message("=> loaded checkpoint '{}' (epoch {})"
                           .format(args.resume, checkpoint['epoch']))
    else:
        print_warning_message("=> no checkpoint found at '{}'".format(args.resume))

# -----------------------------------------------------------------------------
# Loss Fn
# -----------------------------------------------------------------------------
if args.dataset == 'imagenet':
    criterion = nn.CrossEntropyLoss()
    acc_metric = 'Top-1'
elif args.dataset == 'coco':
    criterion = nn.BCEWithLogitsLoss()
    acc_metric = 'F1'
else:
    print_error_message('{} dataset not yet supported'.format(args.dataset))

if num_gpus >= 1:
    model = torch.nn.DataParallel(model)
    model = model.cuda()
    criterion = criterion.cuda()
    if torch.backends.cudnn.is_available():
        import torch.backends.cudnn as cudnn
        cudnn.benchmark = True
        cudnn.deterministic = True

# -----------------------------------------------------------------------------
# Data Loaders
# -----------------------------------------------------------------------------
# Data loading code
if args.dataset == 'imagenet':
    train_loader, val_loader = img_loader.data_loaders(args)
    # import the loaders too
    from utilities.train_eval_classification import train, validate
elif args.dataset == 'coco':
    # from data_loader.classification.coco import COCOClassification
    # train_dataset = COCOClassification(root=args.data, split='train', year='2014', inp_size=args.inpSize,scale=args.scale, is_training=True) # 2017
    # val_dataset = COCOClassification(root=args.data, split='val', year='2014', inp_size=args.inpSize,is_training=False) # 2017
    # dataset processing
    # train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True,
                                              # pin_memory=True, num_workers=args.workers)
    # val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=args.batch_size, shuffle=False,
                                             # pin_memory=True, num_workers=args.workers)
    #

    # Data loading code  
    # code from the ELASTIC paper
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225])

    train_dataset = CocoDetection(os.path.join(args.data, 'images/train2014'),
                              os.path.join(args.data, 'annotations/instances_train2014.json'),
                              transforms.Compose([
                                  transforms.RandomResizedCrop(224),
                                  transforms.RandomHorizontalFlip(),
                                  transforms.ToTensor(),
                                  normalize,
                              ]))
    val_dataset = CocoDetection(os.path.join(args.data, 'images/val2014'),
                            os.path.join(args.data, 'annotations/instances_val2014.json'),
                            transforms.Compose([
                                transforms.Resize((224, 224)),
                                transforms.ToTensor(),
                                normalize,
                            ]))

    train_sampler = torch.utils.data.sampler.RandomSampler(train_dataset)
    # training set
    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=args.batch_size, shuffle=(train_sampler is None),
        num_workers=args.workers, pin_memory=True, sampler=train_sampler, drop_last=True)
    # validation set
    val_loader = torch.utils.data.DataLoader(
        val_dataset, batch_size=args.batch_size, shuffle=False,
        num_workers=args.workers, pin_memory=True)

    # import the loaders too
    from utilities.train_eval_classification import train_multi as train
    from utilities.train_eval_classification import validate_multi as validate
else:
    print_error_message('{} dataset not yet supported'.format(args.dataset))

# -----------------------------------------------------------------------------
# LR schedulers (learning-rate policy)
# -----------------------------------------------------------------------------
if args.scheduler == 'fixed':
    step_sizes = args.steps
    from utilities.lr_scheduler import FixedMultiStepLR
    lr_scheduler = FixedMultiStepLR(base_lr=args.lr, steps=step_sizes, gamma=args.lr_decay)
elif args.scheduler == 'clr':
    from utilities.lr_scheduler import CyclicLR
    step_sizes = args.steps
    lr_scheduler = CyclicLR(min_lr=args.lr, cycle_len=5, steps=step_sizes, gamma=args.lr_decay)
elif args.scheduler == 'poly':
    from utilities.lr_scheduler import PolyLR
    lr_scheduler = PolyLR(base_lr=args.lr, max_epochs=args.epochs)
elif args.scheduler == 'linear':
    from utilities.lr_scheduler import LinearLR
    lr_scheduler = LinearLR(base_lr=args.lr, max_epochs=args.epochs)
elif args.scheduler == 'hybrid':
    from utilities.lr_scheduler import HybirdLR
    lr_scheduler = HybirdLR(base_lr=args.lr, max_epochs=args.epochs, clr_max=args.clr_max)
else:
    print_error_message('Scheduler ({}) not yet implemented'.format(args.scheduler))
    exit()
print("LR scheduler: ", args.scheduler)
print_info_message(lr_scheduler)

# set up the epoch variable in case resuming training
if args.start_epoch != 0:
    for epoch in range(args.start_epoch):
        lr_scheduler.step(epoch)

with open(args.savedir + os.sep + 'arguments.json', 'w') as outfile:
    import json
    arg_dict = vars(args)
    arg_dict['model_params'] = '{} '.format(num_params)
    arg_dict['flops'] = '{} '.format(flops)
    json.dump(arg_dict, outfile)

# -----------------------------------------------------------------------------
# Training and Val Loop
# -----------------------------------------------------------------------------

extra_info_ckpt = args.model + '_' + str(args.s)
for epoch in range(args.start_epoch, args.epochs):
    lr_log = lr_scheduler.step(epoch)
    # set the optimizer with the learning rate
    # This can be done inside the MyLRScheduler
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr_log
    print_info_message("LR for epoch {} = {:.5f}".format(epoch, lr_log))
    train_acc, train_loss = train(data_loader=train_loader, model=model, criteria=criterion, optimizer=optimizer, epoch=epoch, device=device)
    # evaluate on validation set
    val_acc, val_loss = validate(data_loader=val_loader, model=model, criteria=criterion, device=device)

    # remember best prec@1 and save checkpoint
    is_best = val_acc > best_acc
    best_acc = max(val_acc, best_acc)

    weights_dict = model.module.state_dict() if device == 'cuda' else model.state_dict()
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': weights_dict,
        'best_prec1': best_acc,
        'optimizer': optimizer.state_dict(),
    }, is_best, args.savedir, extra_info_ckpt)

    writer.add_scalar('Classification/LR/learning_rate', lr_log, epoch)
    writer.add_scalar('Classification/Loss/Train', train_loss, epoch)
    writer.add_scalar('Classification/Loss/Val', val_loss, epoch)
    writer.add_scalar('Classification/{}/Train'.format(acc_metric), train_acc, epoch)
    writer.add_scalar('Classification/{}/Val'.format(acc_metric), val_acc, epoch)
    writer.add_scalar('Classification/Complexity/Top1_vs_flops', best_acc, round(flops, 2))
    writer.add_scalar('Classification/Complexity/Top1_vs_params', best_acc, round(num_params, 2))

writer.close()

if __name__ == '__main__':
    from commons.general_details import classification_models, classification_datasets, \
        classification_exp_choices, classification_schedulers

parser = argparse.ArgumentParser(description='Training efficient networks')
# General settings
parser.add_argument('--workers', default=4, type=int, help='number of data loading workers (default: 4)')  # 12
parser.add_argument('--batch-size', default=64, type=int, help='mini-batch size (default: 512)')  # 512

# Dataset related settings
parser.add_argument('--data', default='./coco-image', help='path to dataset')
parser.add_argument('--dataset', default='coco', help='Name of the dataset', choices=classification_datasets)

# LR scheduler settings
parser.add_argument('--epochs', default=10, type=int, help='number of total epochs to run')  # 300
parser.add_argument('--start-epoch', default=0, type=int, help='manual epoch number (useful on restarts)')
parser.add_argument('--clr-max', default=61, type=int, help='Max. epochs for CLR in Hybrid scheduler')
parser.add_argument('--steps', default=[51, 101, 131, 161, 191, 221, 251, 281], type=int, nargs="+",
                    help='steps at which lr should be decreased. Only used for Cyclic and Fixed LR')
parser.add_argument('--scheduler', default='clr', choices=classification_schedulers,  # cyclic learning rate
                    help='Learning rate scheduler')
parser.add_argument('--lr', default=0.1, type=float, help='initial learning rate')
parser.add_argument('--lr-decay', default=0.5, type=float, help='factor by which lr should be decreased')

# optimizer settings
parser.add_argument('--momentum', default=0.9, type=float, help='momentum')
parser.add_argument('--weight-decay', default=4e-5, type=float, help='weight decay (default: 4e-5)')
parser.add_argument('--resume', default='', type=str, help='path to latest checkpoint (default: none)')
parser.add_argument('--savedir', type=str, default='results_classification', help='Location to save the results')

# Model settings
parser.add_argument('--s', default=1.0, type=float, help='Factor by which output channels should be scaled (s > 1 for increasing the dims while < 1 for decreasing)')
parser.add_argument('--inpSize', default=224, type=int, help='Input image size (default: 224 x 224)')
parser.add_argument('--scale', default=[0.2, 1.0], type=float, nargs="+", help='Scale for data augmentation')
parser.add_argument('--model', default='shuffle_vw', choices=classification_models,
                    help='Which model? basic= basic CNN model, res=resnet style)')
parser.add_argument('--channels', default=3, type=int, help='Input channels')
# DiceNet related settings
parser.add_argument('--model-width', default=224, type=int, help='Model width')
parser.add_argument('--model-height', default=224, type=int, help='Model height')

## Experiment related settings
parser.add_argument('--exp-type', type=str, choices=classification_exp_choices, default='main',
                    help='Experiment type')
parser.add_argument('--finetune', action='store_true', default=False, help='Finetune the model')  # finetune the model

args = parser.parse_args()

assert len(args.scale) == 2
args.scale = tuple(args.scale)

random.seed(1882)
torch.manual_seed(1882)

timestr = time.strftime("%Y%m%d-%H%M%S")
args.savedir = '{}_{}/model_{}_{}/aug_{}_{}/s_{}_inp_{}_sch_{}/{}/'.format(args.savedir, args.exp_type, args.model,
                                                                           args.dataset, args.scale[0],
                                                                           args.scale[1],
                                                                           args.s, args.inpSize, args.scheduler,
                                                                           timestr)

# if you want to finetune ImageNet model on other dataset, say MS-COCO classification
if args.finetune:
    print_info_message('Grabbing location of the ImageNet weights from the weight dictionary')
    from model.weight_locations.classification import model_weight_map

    weight_file_key = '{}_{}'.format(args.model, args.s)
    assert weight_file_key in model_weight_map.keys(), '{} does not exist'.format(weight_file_key)
    args.weights_ft = model_weight_map[weight_file_key]

if args.dataset == 'imagenet':
    args.num_classes = 1000
elif args.dataset == 'coco':
    from data_loader.classification.coco import COCO_CLASS_LIST
    args.num_classes = len(COCO_CLASS_LIST)

main(args)
```

Dear Sir, I changed the program to use the 2014 version of the dataset, but after the first epoch the results were very bad. At the beginning of the second epoch, the precision and recall values were 0. How can I solve this problem? If possible, could you provide a version of the program for the 2014 dataset? Thank you very much.

(screenshot)

sacmehta commented 5 years ago

You need to use a smaller learning rate. Try 0.005.
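For example, the earlier command with only the learning rate changed:

```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.005 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 64 --s 0.2
```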

mymuli commented 5 years ago

@sacmehta Dear Sir,

I used the dicenet_s_0.2_imagenet_224x224.pth file to fine-tune on MS-COCO with the following command:

```
python train_classification.py --model dicenet \
    --scheduler fixed --lr 0.005 \
    --data 1-mscoco-image --dataset coco \
    --epochs 4 --batch-size 64 --s 0.2 --finetune
```

The experimental process is as follows: (screenshot)

The results of each epoch are as follows:

| Epoch | P_C | R_C | F_C | P_O | R_O | F_O |
|:-----:|----:|----:|----:|----:|----:|----:|
| 1 | 7.34 | 1.55 | 1.50 | 57.24 | 18.15 | 27.56 |
| 2 | 14.77 | 2.39 | 3.39 | 71.41 | 12.82 | 21.74 |
| 3 | 21.93 | 4.68 | 6.16 | 66.84 | 17.82 | 28.14 |
| 4 | 25.78 | 4.75 | 6.29 | 69.06 | 18.25 | 28.87 |

Although there is improvement in each epoch, the effect is not great.

------------------------------------------------------------------------------------------

I ran another set of experiments with the clr scheduler:

```
python train_classification.py --model dicenet \
    --scheduler clr --lr 0.005 \
    --data 1-mscoco-image --dataset coco \
    --epochs 2 --batch-size 64 --s 0.2 --finetune
```

The experimental results are as follows:

(screenshots of the results)

In your training runs, did R_C and F_C also rise slowly from very small values (in my experiment, for example, 1.21/0.88)?

On my desktop (GTX 970, batch size 64), each epoch takes 40 minutes, so 100 epochs would take about 2.7 days. I don't know whether the F_C value can exceed 71.08 (ELASTIC).

Did my experiment go wrong? Can you give me some advice?

Thank you very much for your patient replies!

sacmehta commented 5 years ago

With s=0.2, you cannot get close to the values in the ELASTIC paper. You need to use the best DiCENet model.

Also, a batch size of 64 is too small. Try something like 512 for best results.
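If a batch of 512 does not fit in GPU memory, one general-purpose workaround (not something this repo implements) is gradient accumulation. A minimal sketch, assuming `model`, `criterion`, `optimizer`, `train_loader` and `device` are set up as in the training script, and that the `(3, 80)` targets are collapsed the same way the multi-label training loop does:

```python
accum_steps = 8  # 8 mini-batches of 64 give an effective batch size of 512

optimizer.zero_grad()
for i, (images, target) in enumerate(train_loader):
    images = images.to(device)
    # collapse the per-size rows into a single 80-dim multi-hot vector and
    # convert to float, as BCEWithLogitsLoss expects
    target = target.max(dim=1)[0].float().to(device)
    loss = criterion(model(images), target)
    (loss / accum_steps).backward()  # scale so the accumulated gradient is an average
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

Note that batch-norm statistics are still computed over the small per-step batch, so this approximates, rather than exactly reproduces, training with a batch size of 512.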

Since you are able to run the code, and the rest is hyper-parameter tuning for your machine setup, which is beyond the scope of this repo, I am closing this issue.