cvg / glue-factory

Training library for local feature detection and matching
Apache License 2.0

Possibly faster training #39

Open ducha-aiki opened 11 months ago

ducha-aiki commented 11 months ago

These curves are with SuperPoint-Open. I have been experimenting with faster training, and it seems that the OneCycle policy with max_lr=3e-4 gives the same results (based on training and validation metrics) as the standard schedule, but 2x faster. Maybe this could help others who are training on other types of features.

[training and validation curves]
    lr_schedule:
        max_lr: 3e-4
        epochs: 20
        steps_per_epoch: 782
        type: OneCycleLR
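
For context, here is a minimal standalone sketch (not from the thread; the model, optimizer, and loop are placeholders) of how this config maps onto torch.optim.lr_scheduler.OneCycleLR, which is stepped once per batch:

import torch

# Placeholder model and optimizer, only to make the sketch runnable.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=3e-4,          # peak learning rate, reached partway through the cycle
    epochs=20,
    steps_per_epoch=782,  # optimizer steps per epoch (dataset size / batch size)
)

for epoch in range(20):
    for step in range(782):
        # ...forward pass and loss.backward() would go here...
        optimizer.step()
        scheduler.step()  # OneCycleLR advances once per batch, not once per epoch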
iago-suarez commented 11 months ago

Hi @ducha-aiki,

This is a valuable finding in my opinion! Thanks for it 😊

ducha-aiki commented 10 months ago

One can go even faster, but then the final quality is 1 pp lower, so I don't recommend it.

[training curves for the faster schedule]
mattiasmar commented 9 months ago

@ducha-aiki Did you update train.py, and in particular get_lr_scheduler, in order for these parameters to be successfully consumed by gluefactory?

ducha-aiki commented 9 months ago

Yes, I did:

import torch


def get_lr_scheduler(optimizer, conf):
    """Get lr scheduler specified by conf.train.lr_schedule."""
    if conf.type == "OneCycleLR":
        # OneCycleLR takes max_lr, epochs, and steps_per_epoch directly,
        # so read them out and remove them from the conf; everything else
        # is forwarded via conf.options as usual.
        max_lr = conf.max_lr
        epochs = conf.epochs
        steps_per_epoch = conf.steps_per_epoch
        del conf.max_lr
        del conf.epochs
        del conf.steps_per_epoch
        return getattr(torch.optim.lr_scheduler, conf.type)(
            optimizer,
            max_lr,
            epochs=epochs,
            steps_per_epoch=steps_per_epoch,
            **conf.options,
        )
    if conf.type not in ["factor", "exp", None]:
        return getattr(torch.optim.lr_scheduler, conf.type)(optimizer, **conf.options)