Implement setup for multi-label classification

comsaint commented 3 years ago

Previously, in @AsadBinImtiaz's branch, the Dataset object generated multiple labels (e.g. [1, 0, 0, 0, 1, 1, ..., 0]) for each image, which is what multi-label classification needs. Unfortunately, I overwrote that work because of time constraints and implemented a binary classification setup instead. Now is a good time to implement the multi-label setup.

Tasks:

1. Edit data_processing.py and dataset.py so that the data loader produces a label vector instead of a single label.
2. Change the loss function to binary cross-entropy loss.
3. Modify the training and evaluation functions in train_model.py to work with the new loss and metric (macro-average ROC AUC?). Also check the last classification layer and make sure it does not use Softmax.
4. (Question) How should class weights and the stratification strategy be handled?
5. (Question) How should early stopping be implemented?
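
As a rough illustration of the label vector, loss, and metric described in the tasks above, here is a minimal pure-Python sketch. It is not the code from this repository: the function names (`multi_hot`, `bce_with_logits`, `roc_auc`, `macro_auc`) are hypothetical, and a real implementation would most likely use the equivalents from the deep learning framework in use.

```python
import math

def multi_hot(label_ids, num_classes):
    """Turn a list of class indices into a 0/1 label vector."""
    vec = [0.0] * num_classes
    for i in label_ids:
        vec[i] = 1.0
    return vec

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)), which
    applies an independent sigmoid per label (no softmax involved).
    """
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

def roc_auc(scores, targets):
    """AUC for one label: chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, targets) if y == 1]
    neg = [s for s, y in zip(scores, targets) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auc(score_rows, target_rows):
    """Unweighted mean of per-label AUCs, skipping labels where AUC is undefined."""
    per_label = []
    for j in range(len(target_rows[0])):
        auc = roc_auc([r[j] for r in score_rows], [t[j] for t in target_rows])
        if auc is not None:
            per_label.append(auc)
    return sum(per_label) / len(per_label)

# Example: one image tagged with labels 0 and 2 out of 3 possible labels.
target = multi_hot([0, 2], 3)
loss = bce_with_logits([2.0, -1.0, 0.5], target)
```

The loss takes raw logits on purpose, so the model's last layer can stay linear and the sigmoid lives inside the loss.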

Notes:

Do not confuse multi-class with multi-label classification. link
Good reference on loss function. link
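
To make the multi-class versus multi-label distinction concrete, the small sketch below applies softmax and per-label sigmoid to the same logits. The logit values and the 0.5 threshold are made up for illustration.

```python
import math

def softmax(logits):
    """Multi-class: probabilities compete and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sigmoid(z):
    """Multi-label: each label gets its own independent probability."""
    return 1.0 / (1.0 + math.exp(-z))

logits = [2.0, 2.0, -3.0]  # hypothetical outputs of the last linear layer

multi_class = softmax(logits)               # exactly one class is chosen
multi_label = [sigmoid(z) for z in logits]  # any number of labels can be on

predicted_class = max(range(len(logits)), key=lambda i: multi_class[i])
predicted_labels = [i for i, p in enumerate(multi_label) if p >= 0.5]
```

With softmax, the two equally strong classes split the probability mass and only one can be picked; with sigmoid, both can pass the threshold at the same time.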

comsaint commented 3 years ago

Items 1-3 are resolved by d919051. Q4 is still a puzzle to me; there seems to be no easy solution. Item 5 will be implemented later.
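
For the class weight part of Q4, one common heuristic (an assumption here, not necessarily what this project ends up using) is to weight the positive term of each label's binary cross-entropy by the ratio of negatives to positives in the training set. The helper name `positive_weights` is hypothetical.

```python
def positive_weights(label_rows):
    """Per-label weight = (#negatives / #positives) over the given label vectors.

    Labels with no positive example fall back to a weight of 1.0.
    """
    n = len(label_rows)
    weights = []
    for j in range(len(label_rows[0])):
        pos = sum(row[j] for row in label_rows)
        weights.append((n - pos) / pos if pos > 0 else 1.0)
    return weights

# Four training samples, two labels: label 0 is rarer than label 1.
train_labels = [[1, 0], [0, 0], [0, 1], [0, 1]]
weights = positive_weights(train_labels)
```

This does not address stratified splitting, which is harder in the multi-label case because each sample belongs to several label groups at once.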

comsaint commented 3 years ago

Class weighting is implemented in the misc branch and will become a separate issue. Early stopping is implemented by checkpointing the model whenever its validation loss is better than in previous epochs.
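
The checkpoint-on-improvement idea can be sketched as below. This is a framework-free illustration, not the code in the repository: the class name `BestCheckpoint` is hypothetical, and `state` stands in for whatever the model's saved parameters look like.

```python
import copy

class BestCheckpoint:
    """Keep the model state from the epoch with the lowest validation loss."""

    def __init__(self):
        self.best_loss = float("inf")
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, val_loss, state):
        """Store a copy of `state` if `val_loss` improved; return True if stored."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.best_state = copy.deepcopy(state)
            return True
        return False

# Simulated training run: validation loss improves, then gets worse.
checkpoint = BestCheckpoint()
val_losses = [0.9, 0.7, 0.8, 0.75]
for epoch, val_loss in enumerate(val_losses):
    state = {"epoch": epoch}  # placeholder for real model weights
    checkpoint.update(epoch, val_loss, state)
```

Training still runs for all epochs in this scheme; the saved checkpoint simply corresponds to the best epoch, which gives the same final model as stopping early there.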

comsaint / dlh_project

Implement setup for multi-label classification #17