VTM_intra_CNN_LGBM_patch

Machine Learning based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders

This patch makes the necessary changes to VTM10.2 to reproduce the results of the paper [1]. The CNN and LightGBM models used in this contribution are provided in this repository in the format the encoder needs, and the scripts to reproduce the training are provided as well.

Usage

VTM10.2 + Complexity reduction:
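The build flow below is a minimal sketch, assuming VTM's standard CMake build; the patch filename and all paths are hypothetical, while the repository URL and the VTM-10.2 tag are those of the reference VTM software.

```bash
# Fetch the reference VTM software and check out the 10.2 release
git clone https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM.git
cd VVCSoftware_VTM
git checkout VTM-10.2

# Apply the complexity-reduction patch (hypothetical filename)
git apply /path/to/VTM_intra_CNN_LGBM.patch

# Build with CMake, VTM's standard build system
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j 8
```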

Training the DL model:

The file DLTraining.py can be used to train the DL model by providing a dataset folder and an output folder. The dataset folder should contain two folders: "images_npy" and "ground_truth_npy". Each of them contains a folder named "luma", and each "luma" folder contains four folders, "22", "27", "32" and "37", one per QP value. These QP folders hold the data as .npy files (NumPy arrays saved using np.save). Files in the "images_npy" QP folders are luma CTUs (68x68), whereas files in the "ground_truth_npy" QP folders are ground-truth vectors of size 480; each ground-truth file must have the same name as its corresponding CTU. The script is used with the following arguments: [dataset folder] [output folder]. After training, the model can be exported as a JSON file using frugally-deep so that the encoder can use it; the filename should be "model.json". The expected layout and commands are sketched below.
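For reference, the dataset layout described above looks like this; the .npy file names are illustrative:

```
dataset/
├── images_npy/
│   └── luma/
│       ├── 22/
│       │   └── ctu_0001.npy   (68x68 luma CTU)
│       ├── 27/
│       ├── 32/
│       └── 37/
└── ground_truth_npy/
    └── luma/
        ├── 22/
        │   └── ctu_0001.npy   (ground-truth vector of size 480)
        ├── 27/
        ├── 32/
        └── 37/
```

A minimal sketch of writing one training sample with NumPy; the dtypes and file names are assumptions, not taken from this repository:

```python
import os
import numpy as np

# One 68x68 luma CTU and its 480-element ground-truth vector
# (all-zero placeholders; dtypes are assumptions).
ctu = np.zeros((68, 68), dtype=np.uint8)
gt = np.zeros(480, dtype=np.float32)

os.makedirs("dataset/images_npy/luma/22", exist_ok=True)
os.makedirs("dataset/ground_truth_npy/luma/22", exist_ok=True)

# The ground-truth file must share the name of its corresponding CTU file.
np.save("dataset/images_npy/luma/22/ctu_0001.npy", ctu)
np.save("dataset/ground_truth_npy/luma/22/ctu_0001.npy", gt)
```

Training and export would then look like this; the Keras model filename under the output folder is an assumption, while convert_model.py is the converter that ships with frugally-deep:

```bash
python DLTraining.py /path/to/dataset /path/to/output

# Convert the trained Keras model; the encoder expects the name "model.json"
python frugally-deep/keras_export/convert_model.py /path/to/output/model.h5 model.json
```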

Training the LightGBM model:

The file MLDataPrepAndTraining.py can be used for the data preparation and the training of the LightGBM model. A CSV file listing the .npy files to be used for training should be prepared beforehand; it should contain two columns, "filename" and "qp" (a sample is sketched below). This script is used with the following arguments:
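A minimal sketch of the CSV file mentioned above; the entries are illustrative, and only the two column names come from this README:

```
filename,qp
ctu_0001.npy,22
ctu_0002.npy,27
ctu_0003.npy,32
ctu_0004.npy,37
```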

Reference

[1] Alexandre Tissier, Wassim Hamidouche, Souhaiel Belhadj Dit Mdalsi, Jarno Vanne, Franck Galpin and Daniel Menard, "Machine Learning based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders."