Using mealpy optimizers for training pretrained models

thieu1995 / mealpy

A Collection Of The State-of-the-art Metaheuristic Algorithms In Python (Metaheuristic/Optimizer/Nature-inspired/Biology)

https://mealpy.readthedocs.io

GNU General Public License v3.0

Using mealpy optimizers for training pretrained models #121

Closed. waseemR02 closed this issue 1 year ago

waseemR02 commented 1 year ago

Hi, I have gone through your tutorials and have a few questions. My task involves training mobilenet-ssd fpn320, which is a pre-trained model. I have trained it using the optimizers provided by TensorFlow. Only three optimizers are defined in the protobuf definitions: Adam, RMSProp, and momentum. I would like to use some swarm-based and bio-inspired optimizers instead, namely PSO and IWO. I am also limited to using the pipeline config file. I think I can unravel the pipeline config into Keras APIs somehow, but honestly I still don't have a clue how to start. I have watched your video on integrating a mealpy optimizer into a multilayer perceptron by building a hybrid model, but I don't understand how to do the same for a pre-trained model like mobilenet-ssd. Thanks. Any help would be appreciated.

waseemR02 commented 1 year ago

@thieu1995 Can you please look into this?

thieu1995 commented 1 year ago

Hi @waseemR02,

Metaheuristic algorithms (swarm-based, evolutionary, ...) are gradient-free optimizers, so they are completely different from gradient-based optimizers (GD, SGD, Adam, Adagrad, ...). Besides, MHAs can't solve very high-dimensional problems such as training a deep neural network. You can check several large-scale evolutionary computation competitions (CEC-2023, CEC-2020, ...): they only handle problems with at most 1000 dimensions (variables). Deep neural networks, including pre-trained models, are usually big networks with 100k weights or more, so you can't really use MHAs to fine-tune this kind of model. In the YouTube video I build a hybrid model, but only with a small network.
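To make the dimensionality argument concrete, here is a small sketch that counts the decision variables a metaheuristic would have to optimize if every weight and bias of a fully connected network were treated as one variable. The helper `count_dense_params` and the layer sizes are illustrative assumptions, not taken from mealpy or from mobilenet-ssd.

```python
def count_dense_params(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    Each pair of consecutive layers contributes n_in * n_out weights
    plus n_out biases.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical tiny network: well within the ~1000-variable range.
small = count_dense_params([4, 8, 3])

# Hypothetical single-hidden-layer image classifier: already past 100k variables.
medium = count_dense_params([784, 128, 10])

print("small network variables:", small)
print("medium network variables:", medium)
```

Even a single hidden layer on a 784-dimensional input gives a search space of roughly one hundred thousand variables, which is why the hybrid approach is limited to small networks.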

waseemR02 commented 1 year ago

Is there any alternative? My task requires me to use some metaheuristic optimizers. What if we train only some of the input layers and freeze the hidden layers? Any help would be appreciated.

thieu1995 commented 1 year ago

@waseemR02,

I don't know because I'm not familiar with the pre-trained mobilenet-ssd fpn320 model. I haven't used it before. Perhaps you can treat it as a feature selection problem, choosing which hidden layers to freeze or not. You can use the BinaryVar type to achieve this. Let's say you have 10 hidden layers, and you need to select a combination of layers that yields the best results. For example, you could freeze the 3rd and 8th hidden layers. This is a way to apply metaheuristics to your pre-trained model.
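A minimal sketch of this encoding, under stated assumptions: each candidate solution is a binary vector with one bit per hidden layer (1 = freeze, 0 = train), and the objective scores a given mask. The function `evaluate_mask` is a hypothetical stand-in; in a real setup it would freeze the selected layers, fine-tune briefly, and return a validation loss. The search loop is a plain (1+1) evolutionary strategy written with the standard library only, used here as a placeholder for a mealpy optimizer operating on a `BinaryVar` problem; it does not show mealpy's own API.

```python
import random

N_LAYERS = 10  # hypothetical number of hidden layers that may be frozen


def evaluate_mask(mask):
    """Toy objective, lower is better (assumption for illustration only).

    A real objective would freeze layer i when mask[i] == 1, fine-tune the
    remaining layers, and return the validation loss. This stand-in is
    minimized when only the 3rd and 8th layers are frozen.
    """
    target = [0] * N_LAYERS
    target[2] = 1  # 3rd hidden layer
    target[7] = 1  # 8th hidden layer
    return sum(m != t for m, t in zip(mask, target))


def search_freeze_mask(n_iters=500, seed=0):
    """(1+1) evolutionary search over binary masks, a gradient-free stand-in."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(N_LAYERS)]
    best_fit = evaluate_mask(best)
    start_fit = best_fit
    for _ in range(n_iters):
        # Flip each bit independently with probability 1 / N_LAYERS.
        cand = [1 - b if rng.random() < 1.0 / N_LAYERS else b for b in best]
        fit = evaluate_mask(cand)
        if fit <= best_fit:  # keep the candidate if it is no worse
            best, best_fit = cand, fit
    return best, best_fit, start_fit


mask, fitness, start_fitness = search_freeze_mask()
frozen = [i + 1 for i, bit in enumerate(mask) if bit == 1]
print("frozen layers (1-based):", frozen, "fitness:", fitness)
```

The point of the encoding is that the search space has only 10 binary variables, no matter how many weights the network has, so the problem stays within the range metaheuristics handle well. The expensive part in practice is the objective, since every evaluation means a fine-tuning run.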