Authors: Sedjro Salomon Hotegni, Manuel Berkemeier, and Sebastian Peitz.
This repository contains the source code for our paper "Multi-Objective Optimization for Sparse Deep Multi-Task Learning". The paper introduces a new Multi-Task Learning (MTL) model architecture, the Monitored Deep Multi-Task Network (MDMTN), and investigates how adaptively sparsifying the model during training can reduce model complexity and improve computational efficiency without significantly compromising performance, provided we are willing to apply task-specific adaptations to the network weights.
The experimental framework is based on PyTorch. The code has been tested in a Python 3 environment using JupyterLab and requires the following Python packages: pytorch, torchvision, numpy, Pillow, scipy, and sklearn.
We provide the code to generate the newly introduced Cifar10Mnist dataset in /Data_loaders/Create_Cifar10Mnist_dataset.py.
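For orientation, the sketch below shows one plausible way such a paired dataset could be assembled with torchvision: each sample overlays an MNIST digit on a CIFAR-10 image and carries both labels, one per task. This is only an illustration under assumed construction details (digit size, placement, normalization); the actual recipe is the one in Create_Cifar10Mnist_dataset.py.

```python
import numpy as np
import torch
from torch.utils.data import Dataset
from torchvision import datasets

class Cifar10MnistSketch(Dataset):
    """Illustrative only: pairs a CIFAR-10 image with an MNIST digit overlaid in the
    top-left corner and returns both labels. The real generator is in
    /Data_loaders/Create_Cifar10Mnist_dataset.py."""

    def __init__(self, root="./data", train=True):
        self.cifar = datasets.CIFAR10(root, train=train, download=True)
        self.mnist = datasets.MNIST(root, train=train, download=True)

    def __len__(self):
        return min(len(self.cifar), len(self.mnist))

    def __getitem__(self, idx):
        cifar_img, cifar_label = self.cifar[idx]   # PIL image, 32x32 RGB
        mnist_img, mnist_label = self.mnist[idx]   # PIL image, 28x28 grayscale
        img = np.array(cifar_img, dtype=np.float32) / 255.0
        digit = np.array(mnist_img.resize((16, 16)), dtype=np.float32) / 255.0
        # Overlay the digit on the top-left corner of all three channels.
        img[:16, :16, :] = np.maximum(img[:16, :16, :], digit[..., None])
        x = torch.from_numpy(img).permute(2, 0, 1)  # CHW tensor
        return x, (cifar_label, mnist_label)
```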
The script Train_and_Test.py gathers the helper functions from /src/utils/ for training and testing a model on a dataset, given a preference vector.
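The generic pattern looks roughly like the self-contained sketch below: train with task losses weighted by a preference vector, then report per-task test accuracy. It is only an illustration of the call pattern; the actual helpers in Train_and_Test.py and /src/utils/ additionally handle the sparsity objective, monitoring, and the scalarization used in the paper.

```python
import torch
import torch.nn as nn

def train_and_test_sketch(model, train_loader, test_loader, preference, epochs=1, lr=1e-3):
    """Illustrative two-task train/test loop; `model` is assumed to return one output per task."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, (y1, y2) in train_loader:
            out1, out2 = model(x)  # one output head per task
            loss = preference[0] * criterion(out1, y1) + preference[1] * criterion(out2, y2)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    # Evaluate per-task accuracy on the test set.
    model.eval()
    correct1 = correct2 = total = 0
    with torch.no_grad():
        for x, (y1, y2) in test_loader:
            out1, out2 = model(x)
            correct1 += (out1.argmax(dim=1) == y1).sum().item()
            correct2 += (out2.argmax(dim=1) == y2).sum().item()
            total += y1.size(0)
    return correct1 / total, correct2 / total
```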
By default, the provided scripts train an MDMTN model on the Cifar10Mnist dataset with our methodology. They can, however, be adapted to other settings, such as the HPS model architecture, the MultiMnist dataset, and the MGDA, KDMTL, and MTAN methods. This flexibility is controlled through the config.py file.
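A configuration along the following lines is what lets the same scripts switch between architectures, datasets, and methods. The option names and values here are illustrative assumptions, not the actual contents of config.py.

```python
# Illustrative configuration; the real option names live in config.py.
config = {
    "model": "MDMTN",                  # or "HPS"
    "dataset": "Cifar10Mnist",         # or "MultiMnist"
    "method": "ours",                  # or "MGDA", "KDMTL", "MTAN" (names assumed)
    "preference": [0.1, 0.45, 0.45],   # assumed split: [sparsity, task 1, task 2]
    "epochs": 100,
    "batch_size": 256,
    "lr": 1e-3,
}
```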
To train a model for a specific preference vector $k$ while inducing sparsity, execute the script "example_mdmtn_cm.py".
To train a model with the Multiple Gradient Descent Algorithm (MGDA), run the script "example_mdmtn_mgda_cm.py".
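The core idea behind MGDA for two tasks, following Sener and Koltun, is to take the convex combination of the task gradients with minimal norm, which has a closed form in the two-task case. The sketch below illustrates that idea only; it is not the repository's implementation and assumes both losses depend on all model parameters.

```python
import torch

def min_norm_coeff(g1, g2):
    """Alpha in [0, 1] minimizing ||alpha * g1 + (1 - alpha) * g2||^2 (closed form)."""
    diff = g1 - g2
    denom = diff.dot(diff).clamp(min=1e-12)
    alpha = (g2 - g1).dot(g2) / denom
    return alpha.clamp(0.0, 1.0)

def mgda_step(model, task_losses, optimizer):
    """One illustrative MGDA update for two task losses."""
    grads = []
    for loss in task_losses:
        optimizer.zero_grad()
        loss.backward(retain_graph=True)
        grads.append(torch.cat([
            (p.grad if p.grad is not None else torch.zeros_like(p)).flatten().clone()
            for p in model.parameters()]))
    alpha = min_norm_coeff(grads[0], grads[1])
    combined = alpha * grads[0] + (1 - alpha) * grads[1]
    # Write the min-norm combination back into the parameter gradients and step.
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.grad = combined[offset:offset + n].view_as(p).clone()
        offset += n
    optimizer.step()
```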
Use the script "twoDpf_study_mdmtn_cm.py" for the additional (2D) Pareto front study we conducted on the model architectures. This requires first finding the preference vector that yields the best model performance and saving the obtained model in the directory src/Sparse_models/.
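Conceptually, such a study sweeps preference vectors over the two task objectives, trains (or fine-tunes the saved sparse model) for each one, and keeps the non-dominated results to trace the front. The sketch below only illustrates this sweep; the `evaluate` callable and its return values are assumptions, not the script's actual interface.

```python
import numpy as np

def pareto_front_sweep(evaluate, n_points=11):
    """`evaluate(k)` is assumed to train/fine-tune for preference k and return (loss1, loss2)."""
    results = []
    for w in np.linspace(0.0, 1.0, n_points):
        k = (float(w), float(1.0 - w))  # preference over the two task objectives
        results.append((k, evaluate(k)))
    return results

def nondominated(results):
    """Keep only preference/loss pairs whose losses are not dominated (lower is better)."""
    keep = []
    for i, (_, (a1, a2)) in enumerate(results):
        dominated = any(b1 <= a1 and b2 <= a2 and (b1, b2) != (a1, a2)
                        for j, (_, (b1, b2)) in enumerate(results) if j != i)
        if not dominated:
            keep.append(results[i])
    return keep
```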
We utilize specific code segments from Ozan Sener and Vladlen Koltun's work 'Multi-Task Learning as Multi-Objective Optimization' to train a model using the Multiple Gradient Descent Algorithm (MGDA). Additionally, we incorporate code from Dejiao Zhang et al.'s paper 'Learning to Share: Simultaneous Parameter Tying and Sparsification in Deep Learning' to implement GrOWL as the secondary (sparsity-inducing) objective function. Moreover, we follow Dmytro Mishkin and Jiri Matas's 'All You Need is a Good Init' for the initialization of our models.
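For reference, the GrOWL penalty from Zhang et al. is an ordered weighted L1 norm applied to the L2 norms of the rows (neuron-wise groups) of a weight matrix. The sketch below shows that penalty with an assumed linearly decaying weight sequence; the exact weight schedule used in our experiments may differ.

```python
import torch

def growl_penalty(weight, lambdas):
    """weight: (n_rows, n_cols) matrix; lambdas: non-negative, non-increasing, one per row.
    Returns sum_i lambdas[i] * (i-th largest row L2 norm)."""
    row_norms = weight.norm(p=2, dim=1)                  # one group per output neuron
    sorted_norms, _ = torch.sort(row_norms, descending=True)
    return (lambdas * sorted_norms).sum()

# Example with an assumed linearly decaying weight sequence:
W = torch.randn(64, 128)
lambdas = torch.linspace(1.0, 0.1, steps=64)             # non-increasing weights
penalty = growl_penalty(W, lambdas)
```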
If you use this project or its findings in your research, please cite our paper:
@INPROCEEDINGS{10650994,
author={Hotegni, Sedjro S. and Berkemeier, Manuel and Peitz, Sebastian},
booktitle={2024 International Joint Conference on Neural Networks (IJCNN)},
title={Multi-Objective Optimization for Sparse Deep Multi-Task Learning},
year={2024},
doi={10.1109/IJCNN60899.2024.10650994}
}