Official PyTorch implementation of LSSVC: A Learned Spatially Scalable Video Coding Scheme
We only provide a test script that performs real bitstream writing. Please build the C++ code before testing so that actual bitstreams can be written:
sudo apt-get install cmake g++
cd src
mkdir build
cd build
conda activate $YOUR_PY36_ENV_NAME
cmake ../cpp -DCMAKE_BUILD_TYPE=Release
make -j
We provide our pretrained models. IntraSS is a degraded version of LSSVC that does not use inter-frame references.
We provide the encoder and decoder command lines for HM-18.0, VTM-21.2, and SHM. In our experiments, VTM-21.2 is used for both simulcast coding and two-layer scalable coding. Simulcast encoding and decoding consists of two separate single-layer encoding and decoding processes, one per resolution, so below we summarize the command lines used for single-layer coding (a filled-in simulcast example follows the VTM single-layer commands).
TAppEncoder
-c encoder_lowdelay_main.cfg
--IntraPeriod={intra period}
--FramesToBeEncoded=96
--InputFile={input file name}
--FrameRate={frame rate}
--SourceWidth={width}
--SourceHeight={height}
--InputBitDepth=8
--InputChromaFormat=420
--DecodingRefreshType=2
--ConformanceWindowMode=1
--QP={qp}
--BitstreamFile={bitstream file name}
TAppDecoder
-b {bitstream file name}
-o {reconstructed file name}
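For reference, here is one way the HM template above might be filled in. The sequence name, resolution, frame rate, QP, and intra period are illustrative placeholders, not necessarily the settings used in the paper.
TAppEncoder -c encoder_lowdelay_main.cfg --IntraPeriod=32 --FramesToBeEncoded=96 --InputFile=BasketballDrive_1920x1080_50.yuv --FrameRate=50 --SourceWidth=1920 --SourceHeight=1080 --InputBitDepth=8 --InputChromaFormat=420 --DecodingRefreshType=2 --ConformanceWindowMode=1 --QP=32 --BitstreamFile=BasketballDrive_qp32.bin
TAppDecoder -b BasketballDrive_qp32.bin -o BasketballDrive_qp32_rec.yuv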
EncoderApp
-c encoder_lowdelay_vtm.cfg
--IntraPeriod={intra period}
--FramesToBeEncoded=96
--InputFile={input file name}
--FrameRate={frame rate}
--SourceWidth={width}
--SourceHeight={height}
--InputBitDepth=8
--InputChromaFormat=420
--DecodingRefreshType=2
--OutputBitDepth=8
--QP={qp}
--BitstreamFile={bitstream file name}
DecoderApp
-b {bitstream file name}
-o {reconstructed file name}
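To illustrate how simulcast coding is assembled from these single-layer runs, the base-resolution and full-resolution sequences can be encoded and decoded independently with VTM-21.2; the simulcast rate is then typically taken as the sum of the two bitstream sizes. All file names, resolutions, QPs, and intra periods below are example placeholders, not the exact settings used in the paper.
EncoderApp -c encoder_lowdelay_vtm.cfg --IntraPeriod=32 --FramesToBeEncoded=96 --InputFile=BasketballDrive_960x540_50.yuv --FrameRate=50 --SourceWidth=960 --SourceHeight=540 --InputBitDepth=8 --InputChromaFormat=420 --DecodingRefreshType=2 --OutputBitDepth=8 --QP=37 --BitstreamFile=simulcast_BL.bin
EncoderApp -c encoder_lowdelay_vtm.cfg --IntraPeriod=32 --FramesToBeEncoded=96 --InputFile=BasketballDrive_1920x1080_50.yuv --FrameRate=50 --SourceWidth=1920 --SourceHeight=1080 --InputBitDepth=8 --InputChromaFormat=420 --DecodingRefreshType=2 --OutputBitDepth=8 --QP=32 --BitstreamFile=simulcast_EL.bin
DecoderApp -b simulcast_BL.bin -o simulcast_BL_rec.yuv
DecoderApp -b simulcast_EL.bin -o simulcast_EL_rec.yuv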
TAppEncoder
-c encoder_lowdelay_scalable.cfg
-c layers.cfg
--IntraPeriod0={intra period of layer0}
--InputFile0={input file name of layer0}
--FrameRate0={frame rate of layer0}
--SourceWidth0={width of layer0}
--SourceHeight0={height of layer0}
--QP0={qp of layer0}
--IntraPeriod1={intra period of layer1}
--InputFile1={input file name of layer1}
--FrameRate1={frame rate of layer1}
--SourceWidth1={width of layer1}
--SourceHeight1={height of layer1}
--QP1={qp of layer1}
--FramesToBeEncoded=96
--InputBitDepth=8
--InputChromaFormat=420
--DecodingRefreshType=2
--BitstreamFile={bitstream file name}
TAppDecoder
-b {bitstream file name}
-o {reconstructed file name}
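A filled-in SHM example with layer 0 at half resolution and layer 1 at full resolution, following the template above (all values are illustrative placeholders):
TAppEncoder -c encoder_lowdelay_scalable.cfg -c layers.cfg --IntraPeriod0=32 --InputFile0=BasketballDrive_960x540_50.yuv --FrameRate0=50 --SourceWidth0=960 --SourceHeight0=540 --QP0=34 --IntraPeriod1=32 --InputFile1=BasketballDrive_1920x1080_50.yuv --FrameRate1=50 --SourceWidth1=1920 --SourceHeight1=1080 --QP1=32 --FramesToBeEncoded=96 --InputBitDepth=8 --InputChromaFormat=420 --DecodingRefreshType=2 --BitstreamFile=shm_scalable.bin
TAppDecoder -b shm_scalable.bin -o shm_scalable_rec.yuv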
EncoderApp
-c encoder_lowdelay_vtm.cfg
-c two-layers.cfg
-l0 --FrameRate={frame rate of layer0}
-l0 --FramesToBeEncoded=96
-l0 --IntraPeriod={intra period of layer0}
-l0 --SourceWidth={width of layer0}
-l0 --SourceHeight={height of layer0}
-l0 --OutputBitDepth=8
-l0 --InputFile={input file name of layer0}
-l0 --QP={qp of layer0}
-l1 --FrameRate={frame rate of layer1}
-l1 --FramesToBeEncoded=96
-l1 --IntraPeriod={intra period of layer1}
-l1 --SourceWidth={width of layer1}
-l1 --SourceHeight={height of layer1}
-l1 --OutputBitDepth=8
-l1 --InputFile={input file name of layer1}
-l1 --QP={qp of layer1}
--BitstreamFile={bitstream file name}
DecoderApp
-b {bitstream file name}
-o {reconstructed file name}
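Likewise, the VTM two-layer scalable template might be instantiated as follows, keeping the same per-layer option prefixes as above (example placeholders only):
EncoderApp -c encoder_lowdelay_vtm.cfg -c two-layers.cfg -l0 --FrameRate=50 -l0 --FramesToBeEncoded=96 -l0 --IntraPeriod=32 -l0 --SourceWidth=960 -l0 --SourceHeight=540 -l0 --OutputBitDepth=8 -l0 --InputFile=BasketballDrive_960x540_50.yuv -l0 --QP=34 -l1 --FrameRate=50 -l1 --FramesToBeEncoded=96 -l1 --IntraPeriod=32 -l1 --SourceWidth=1920 -l1 --SourceHeight=1080 -l1 --OutputBitDepth=8 -l1 --InputFile=BasketballDrive_1920x1080_50.yuv -l1 --QP=32 --BitstreamFile=vtm_scalable.bin
DecoderApp -b vtm_scalable.bin -o vtm_scalable_rec.yuv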
If you find our work useful for your research, please cite:
@ARTICLE{10521480,
author={Bian, Yifan and Sheng, Xihua and Li, Li and Liu, Dong},
journal={IEEE Transactions on Image Processing},
title={LSSVC: A Learned Spatially Scalable Video Coding Scheme},
year={2024},
volume={33},
number={},
pages={3314-3327},
keywords={Video coding;Encoding;Image coding;Standards;Scalability;Spatial resolution;Static VAr compensators;Learned video coding;spatial scalability;scalable video coding;contextual MV encoder-decoder;hybrid temporal-layer context mining;interlayer prior},
doi={10.1109/TIP.2024.3395025}}