cvg / glue-factory

Training library for local feature detection and matching
Apache License 2.0

Possibly faster training #39

Open ducha-aiki opened 11 months ago

ducha-aiki commented 11 months ago

These curves are with SuperPoint-Open. I have been experimenting with faster training, and it seems that the OneCycle policy with max_lr=3e-4 gives the same results (based on training and validation metrics) as the standard schedule, but 2x faster. Maybe this could help others who are training on other types of features.

[training and validation curves]
    lr_schedule:
        max_lr: 3e-4
        epochs: 20
        steps_per_epoch: 782
        type: OneCycleLR
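
For context, here is a minimal standalone sketch (not from the thread; the model, optimizer, and loop are placeholders) of how this config maps onto torch.optim.lr_scheduler.OneCycleLR, which is stepped once per batch:

import torch

# Placeholder model and optimizer, only to make the sketch runnable.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=3e-4,          # peak learning rate, reached partway through the cycle
    epochs=20,
    steps_per_epoch=782,  # optimizer steps per epoch (dataset size / batch size)
)

for epoch in range(20):
    for step in range(782):
        # ...forward pass and loss.backward() would go here...
        optimizer.step()
        scheduler.step()  # OneCycleLR advances once per batch, not once per epoch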
iago-suarez commented 11 months ago

Hi @ducha-aiki,

This is a valuable finding in my opinion! Thanks for it 😊

ducha-aiki commented 10 months ago

One can go even faster, but then the final quality is 1 pp lower, so I don't recommend it.

[training curves for the faster schedule]
mattiasmar commented 9 months ago

@ducha-aiki Did you update train.py, and in particular get_lr_scheduler, in order for these parameters to be successfully consumed by gluefactory?

ducha-aiki commented 9 months ago

Yes, I did:

import torch


def get_lr_scheduler(optimizer, conf):
    """Get lr scheduler specified by conf.train.lr_schedule."""
    if conf.type == "OneCycleLR":
        # OneCycleLR takes max_lr, epochs, and steps_per_epoch directly,
        # so read them out and remove them from the conf; everything else
        # is forwarded via conf.options as usual.
        max_lr = conf.max_lr
        epochs = conf.epochs
        steps_per_epoch = conf.steps_per_epoch
        del conf.max_lr
        del conf.epochs
        del conf.steps_per_epoch
        return getattr(torch.optim.lr_scheduler, conf.type)(
            optimizer,
            max_lr,
            epochs=epochs,
            steps_per_epoch=steps_per_epoch,
            **conf.options,
        )
    if conf.type not in ["factor", "exp", None]:
        return getattr(torch.optim.lr_scheduler, conf.type)(optimizer, **conf.options)