TDT4173-Machine-Learning

TDT4173 Machine Learning

Project Description

This is the main project of the NTNU course TDT4173 - Machine Learning. The paper regarding this project can be found here.

The purpose of this project is to compare a classic Convolutional Neural Network (CNN, called convpool in this project) with the newer Capsule Neural Network (CapsNet) on the same dataset. We developed a program containing our own CNN implementation and an adaptation of an existing CapsNet implementation, both based on the PyTorch framework. The original CapsNet implementation (all rights to jindongwang) can be found here: Modules for image augmentation and standardization of the dataset are also implemented in this project.

The runtime of this project is configurable by choosing which classifier to use (convpool or capsnet) and a mode, either "train and evaluate" or "conduct a study". Both implementations support a CUDA-enabled graphics card for speeding up calculations in both modes.


If no classifier is specified, the program creates or loads a "convpool" model. The user guide shows how to customize the runtime of the program with parameters. Some pre-trained models can be loaded into the program for evaluation purposes. Because of their large size, they are distributed separately: download them from this link and place them in the project root. More instructions follow the download.
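
The runtime selection and the "convpool" default described above could be wired up with argparse roughly as follows. This is a minimal sketch, not the project's actual command-line interface: the flag names --classifier, --mode and --no-cuda and the function build_parser are hypothetical, and only the two classifier names, the two modes and the "convpool" default come from the text.

```python
import argparse


def build_parser():
    """Build a parser for the two runtime choices: classifier and mode."""
    parser = argparse.ArgumentParser(
        description="Flower classification with convpool or capsnet"
    )
    # Hypothetical flag: which classifier to create or load.
    parser.add_argument(
        "--classifier", choices=["convpool", "capsnet"], default="convpool"
    )
    # Hypothetical flag: train and evaluate a model, or conduct a study.
    parser.add_argument("--mode", choices=["train", "study"], default="train")
    # Hypothetical flag: stay on the CPU even when a CUDA device is present.
    parser.add_argument("--no-cuda", action="store_true")
    return parser


# Parsing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--classifier", "capsnet", "--mode", "study"])
print(args.classifier, args.mode, args.no_cuda)
```

Calling parse_args([]) on the same parser falls back to the "convpool" classifier, which mirrors the default behaviour described above.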

"Train and Evaluate" mode

In this mode, the program trains the selected model with the chosen parameters. The program can save and load trained classifier weights. There are options for plotting statistics about the training process, such as the loss per epoch. The program then proceeds to evaluate the model here. The training phase can be skipped, in which case the program only evaluates the model. Training is started by this function, which redirects the program to the right training routine for capsnet or convpool.
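
The redirection to the right training routine can be pictured as a small dispatch table. The sketch below only illustrates the idea and is not the project's code: train_convpool, train_capsnet, TRAINERS and the skip_training argument are hypothetical names, and the placeholder routines return a string instead of training a network.

```python
def train_convpool(cfg):
    """Placeholder for the convpool training routine."""
    return "trained convpool for {} epochs".format(cfg["epochs"])


def train_capsnet(cfg):
    """Placeholder for the capsnet training routine."""
    return "trained capsnet for {} epochs".format(cfg["epochs"])


# Map the lower-cased "type" entry of a configuration to its routine.
TRAINERS = {"convpool": train_convpool, "capsnet": train_capsnet}


def train(cfg, skip_training=False):
    """Dispatch to the training routine that matches cfg["type"]."""
    if skip_training:
        # Evaluation-only run: nothing is trained.
        return None
    key = cfg["type"].lower()
    if key not in TRAINERS:
        raise ValueError("unknown classifier type: {}".format(cfg["type"]))
    return TRAINERS[key](cfg)


print(train({"type": "ConvPool", "epochs": 200}))
print(train({"type": "CapsNet", "epochs": 50}, skip_training=True))
```

A table like this keeps the choice of classifier in one place, so adding a third model would only require one more entry.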

"Conduct a study" mode

In this mode, the program conducts a study on the selected classifier to find the best hyperparameters. Both the number of trials and the value range of each hyperparameter are configurable. A study is then saved as a CSV file and as an object dump. Graphs are generated that visualize the importance of each hyperparameter for the final score, the score per trial, and an empirical distribution of all the results.

This mode is implemented in Source\, where conduct_study(n_trials, classifier_type) starts a new study and initializes a model with the configuration found at the top of the file. It then proceeds by iterating through the objective(trial) function, which fully trains the model with the suggested values; the model is then evaluated and the result is saved in the study object. Since the program evaluates the model after each training epoch, an unpromising trial can be pruned to save time. Parameter optimization is implemented with the Optuna library; documentation can be found here:
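
To illustrate why per-epoch evaluation makes pruning possible, here is a simplified median rule in plain Python. It is a stand-in for illustration only and is not Optuna's implementation; in Optuna the same idea is normally expressed by calling trial.report(value, step) after each epoch and raising optuna.TrialPruned when trial.should_prune() returns True. The names should_prune, run_trial and history, and all scores, are made up for the example.

```python
from statistics import median


def should_prune(history, epoch, score, min_trials=2):
    """Return True if score at epoch is below the median of earlier trials.

    history holds one list of per-epoch scores for every finished trial.
    """
    earlier = [scores[epoch] for scores in history if len(scores) > epoch]
    if len(earlier) < min_trials:
        # Too few finished trials reached this epoch to judge the new one.
        return False
    return score < median(earlier)


def run_trial(epoch_scores, history):
    """Replay the per-epoch scores of one trial, stopping early if pruned."""
    seen = []
    for epoch, score in enumerate(epoch_scores):
        seen.append(score)
        if should_prune(history, epoch, score):
            return seen, True
    return seen, False


# Two finished trials, then one weak and one strong new trial.
history = [[0.40, 0.55, 0.70], [0.35, 0.50, 0.65]]
print(run_trial([0.45, 0.30, 0.20], history))
print(run_trial([0.50, 0.60, 0.75], history))
```

The weak trial is stopped after its second epoch, so the third epoch is never trained; that skipped work is the time saving mentioned above.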


The dataset is composed of 4242 pictures of flowers divided into 5 categories: Sunflower, Rose, Dandelion, Daisy and Tulip. Source of the dataset:

Installation Guide

The project was developed using Python 3.7, but newer versions should be compatible too. Compatibility with Python 2.7 is not guaranteed.

CUDA Toolkit installation

The CUDA Toolkit is necessary if you want to use your NVIDIA GPU for tensor operations. Check if you have a compatible GPU here: CUDA Toolkit version used during development of this project: CUDA 10.2. Other versions should work too, but this is not guaranteed. Installation guide for Windows and Linux:

PyTorch installation

If PyTorch is not installed, follow the guide on the official PyTorch website to install the correct version for your machine. Different builds are available, both with support for different CUDA versions and with CPU-only support. PyTorch package versions used when developing this project: "torch===1.7.0 torchvision===0.8.1 torchaudio===0.7.0" Official starting guide:
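
After installing, a quick check like the one below can confirm whether PyTorch is importable and whether it sees a CUDA device. This is a small sketch and not part of the project; the function name pick_device and its prefer_cuda argument are hypothetical, while torch.cuda.is_available() is the standard PyTorch call for this check.

```python
def pick_device(prefer_cuda=True):
    """Return "cuda" when PyTorch sees a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so nothing can run on the GPU.
        return "cpu"
    if prefer_cuda and torch.cuda.is_available():
        return "cuda"
    return "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only or does not match the installed CUDA Toolkit version.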

Install required packages

Run "pip install -r requirements.txt" to install the rest of the required packages for this project.

If you use a Python version older than 3.2, run this command to install the additional packages needed: "pip install argparse pathlib".

User Guide

Run "" to start the program. The "" script accepts the following parameters:

Some examples of use:


Train and Evaluate mode

The variables for both the convpool and capsnet models are found in the "" file.
import torch.nn as nn
import torch.optim as optim

convpool_cfg = {
        "type": "ConvPool", # CapsNet or ConvPool
        "image_size": (100, 100), # (X, Y)
        "learning_rate": 6.34192248576476e-05,
        "mini_batch_size": 32, # Number of images per batch
        "test_batch_size": 20, # Images per category to test on
        "step_size": 32, # Number of steps per epoch
        "epochs": 200,
        "dropout": True,
        "dropout_rate": 0.4,
        "prnt": True, # Print more information about the errors after evaluation
        "optimizer": optim.Adam,
        "criterion": nn.MSELoss(),
        "save_weights": True
}

### Settings for tweaking training and testing with Capsule neural network
capsnet_cfg = {
        "type": "CapsNet", # CapsNet or ConvPool
        "image_size": (100, 100), # (X, Y)
        "learning_rate": 0.0077,
        "mini_batch_size": 10, # Number of images per batch
        "test_batch_size": 20, # Images per category to test on
        "step_size": 10, # Number of steps per epoch
        "epochs": 50,
        "dropout": True,
        "dropout_rate": 0.4,
        "prnt": True, # Print more information about the errors after evaluation
        "optimizer": optim.Adam,
        "criterion": nn.MSELoss(),
        "save_weights": True
}

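Read together, "mini_batch_size" and "step_size" determine how many images a model sees per epoch. The sketch below works through that arithmetic for the two configurations above, under the assumption (not confirmed by the source) that every step draws exactly one mini-batch. The helper names images_per_epoch and images_per_run are made up for illustration.

```python
def images_per_epoch(cfg):
    """Images consumed in one epoch if every step draws one mini-batch."""
    return cfg["mini_batch_size"] * cfg["step_size"]


def images_per_run(cfg):
    """Images consumed over all epochs of one training run."""
    return images_per_epoch(cfg) * cfg["epochs"]


# Only the three entries needed for the arithmetic are repeated here.
convpool = {"mini_batch_size": 32, "step_size": 32, "epochs": 200}
capsnet = {"mini_batch_size": 10, "step_size": 10, "epochs": 50}

for name, cfg in (("convpool", convpool), ("capsnet", capsnet)):
    print(name, images_per_epoch(cfg), images_per_run(cfg))
```

Under that assumption the convpool settings process 1024 images per epoch and the capsnet settings process 100, which is worth keeping in mind when comparing epoch counts between the two models.
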
Conduct a study mode

The variables used when conducting a study are found in the "objective" function in the "" file. It uses an optuna.trial.Trial object that, for each trial, suggests parameters from the value ranges manually defined in the cfg object. There are different "suggest" methods with different value distributions. Documentation is found here: Results of the conducted study can be found under Results\Study\csv. A dump of the "study" object is found in Results\study, useful for future analysis or for implementing further functionality.
import torch.nn as nn
import torch.optim as optim

def objective(trial):

    cfg = {
        "type": "CapsNet", # CapsNet or ConvPool
        "image_size": trial.suggest_categorical('image_size', [(224, 224), (180, 180), (150, 150), (300, 300)]),
        "learning_rate": trial.suggest_loguniform('lr', low=1e-3, high=1e-2),
        "mini_batch_size": 32, # Number of images per batch
        "test_batch_size": 20, # Images per category to test on
        "step_size": 10, # Number of steps per epoch
        "epochs": trial.suggest_int('epochs', low=30, high=60, step=5),
        "dropout": True,
        "dropout_rate": trial.suggest_discrete_uniform('droput_rate', low=0.1, high=0.5, q=0.1),
        "prnt": False,
        "optimizer": optim.Adam, # trial.suggest_categorical('optimizer', [optim.Adam, optim.SGD]),
        "criterion": nn.MSELoss(),
        "save_weights": False
    }
    # The rest of the function (not shown) trains and evaluates the model with cfg.

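To make the search space in the objective function above concrete, the following sketch lists the values that its two discrete suggestions can take. It uses plain Python rather than Optuna; the helpers int_grid and float_grid are illustrative names, and the enumeration assumes that both bounds are inclusive, as they are for Optuna's suggest_int with step and suggest_discrete_uniform with q.

```python
def int_grid(low, high, step):
    """Values reachable by an integer suggestion with inclusive bounds."""
    return list(range(low, high + 1, step))


def float_grid(low, high, q):
    """Values reachable by a discretized float suggestion with step q."""
    count = int(round((high - low) / q))
    # round() hides floating-point noise such as 0.30000000000000004.
    return [round(low + i * q, 10) for i in range(count + 1)]


image_sizes = [(224, 224), (180, 180), (150, 150), (300, 300)]
epochs = int_grid(30, 60, 5)
dropout_rates = float_grid(0.1, 0.5, 0.1)

print(epochs)
print(dropout_rates)
# Number of distinct discrete combinations. The learning rate is sampled
# continuously (log-uniform between 1e-3 and 1e-2) and is not counted here.
print(len(image_sizes) * len(epochs) * len(dropout_rates))
```

Even without the continuous learning rate there are 140 discrete combinations, which is why a sampled study with pruning is used instead of an exhaustive grid.
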
Folder structure

|  |
|  |___Processed
|      |
|      |__test
|      |
|      |__train
   |  |
   |  |__capsnet
   |  |
   |  |__convpool

Further Work

This project is presented in a dedicated paper, which is the deliverable for this course, and the project will be considered finished upon delivery of that paper. Nevertheless, the design and functionality of this program can be improved and extended. Faster training can be achieved by redesigning the way the program manages pictures and tensors. A better file structure, variable separation and user interface are also points where improvement is possible.