Hello!
Opening a PR that unlocks fw's capability to account for different hyperparameters when loading from an initial model.
The current version freezes hyperparameters when loading from a pre-trained model (-i): even if, e.g., --learning_rate is passed during fine-tuning with new data, the defaults are always used. This PR adapts persistence.rs to check the command-line parameters and update the loaded model's hyperparameter values (only the ones specified as constants at the beginning of persistence.rs), so hyperparameters can be varied between incremental updates in an online learning setting. Currently --learning_rate and --power_t are supported.
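A minimal sketch of the override logic, independent of fw's actual internals (`ModelInstance`, its field names, and the flag detection here are illustrative assumptions; the real change lives in persistence.rs):

```rust
use std::env;

// Hypothetical stand-in for the hyperparameters persisted alongside a model.
#[derive(Debug)]
struct ModelInstance {
    learning_rate: f32,
    power_t: f32,
}

// Returns Some(value) only when the flag was explicitly given on the command
// line, so persisted values are kept unless the user overrides them.
fn cli_override(flag: &str) -> Option<f32> {
    let args: Vec<String> = env::args().collect();
    args.iter()
        .position(|a| a == flag)
        .and_then(|i| args.get(i + 1))
        .and_then(|v| v.parse::<f32>().ok())
}

fn apply_hyperparameter_overrides(mi: &mut ModelInstance) {
    if let Some(lr) = cli_override("--learning_rate") {
        mi.learning_rate = lr; // replace the frozen, persisted value
    }
    if let Some(pt) = cli_override("--power_t") {
        mi.power_t = pt;
    }
}

fn main() {
    // Pretend this came out of deserializing a model loaded via -i.
    let mut mi = ModelInstance { learning_rate: 0.5, power_t: 0.5 };
    apply_hyperparameter_overrides(&mut mi);
    println!("effective hyperparameters: {:?}", mi);
}
```

The key design choice is that only an explicitly supplied flag overrides the persisted value, which keeps existing load-and-continue workflows unchanged.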
Tested by simulating a pre-trained model and varying --learning_rate: larger/smaller learning rates produce correspondingly larger/smaller weight updates (as expected, since update magnitude is directly proportional to the learning rate), and the prediction distributions differ accordingly.
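For reference, a hypothetical invocation pattern for such a test (file names and any flags beyond -i, --learning_rate, and --power_t are placeholders):

```
fw -i pretrained_model ... --learning_rate 0.01   # small weight updates
fw -i pretrained_model ... --learning_rate 0.5    # substantially larger updates
```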