Optimizers that the official torch implementation does not provide are currently added as separate classes, which makes the codebase heavier. One fix is to base the torch optimizers on PyTorch-Optimizer, as we've done with torchmetrics.
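
A minimal sketch of what this could look like: optimizer names are resolved by lookup instead of being wrapped in one class each. The helper name `resolve_optimizer` is hypothetical, and the import name `torch_optimizer` for the PyTorch-Optimizer package is an assumption; adjust it to whichever distribution the project adopts.

```python
import importlib


def resolve_optimizer(name, modules=("torch.optim", "torch_optimizer")):
    """Return the first class called `name` found in `modules`, searched in order.

    Hypothetical helper: the default module names are assumptions, not a fixed API.
    """
    for module_name in modules:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # The package is not installed; fall through to the next source.
            continue
        candidate = getattr(module, name, None)
        # Only accept classes, so helper functions with the same name are skipped.
        if isinstance(candidate, type):
            return candidate
    raise ValueError(f"Optimizer {name!r} not found in any of: {', '.join(modules)}")
```

With torch installed, `resolve_optimizer("Adam")` would be expected to return `torch.optim.Adam`, while a name that only the third-party package defines would be picked up from the second module. Searching the official module first keeps the official implementation preferred whenever both define the same name.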