AolinFeng / PMP-VVC-TIP2023

Partition Map Prediction for Fast Block Partitioning in VVC Intra-frame Coding

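Hello, I have successfully obtained the .npy files produced by GenMSBtMap.py, but CreateDataSet.py fails to run. The luma partitionMap is obtained normally, but the program raises an error when obtaining the chroma partitionMap. The cause is that the output of Load_Infe_VP_Dataset() (i.e., the CNN input) has 2 channels (the U and V components, according to your code), whereas the CNN expects a 3-channel input. Could you provide an updated script? #4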

Closed GongchunDing closed 1 year ago

GongchunDing commented 1 year ago

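Hello, I have successfully obtained the .npy files produced by GenMSBtMap.py, but CreateDataSet.py does not run correctly. The luma partitionMap is obtained normally, but the program raises an error when obtaining the chroma partitionMap. The cause is that the output of Load_Infe_VP_Dataset() (i.e., the CNN input) has 2 channels (the U and V components, according to your code), whereas the CNN expects a 3-channel input. Could you provide an updated script?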

AolinFeng commented 1 year ago

I am not sure I understand the problem correctly. Getting the chroma partitionMap should happen in the training set preparation stage, whereas the Load_Infe_VP_Dataset() function is used in the inference stage. Why would an error in getting the chroma partitionMap be caused by Load_Infe_VP_Dataset()?

As for the output of Load_Infe_VP_Dataset() and input of CNN, you can review the Inference_QBD.py and Model_QBD.py for details. Specifically, there are four sub-nets, Luma_Q, Luma_MSBD, Chroma_Q, and Chroma_MSBD. The output of Load_Infe_VP_Dataset() is not directly fed into the Chroma_MSBD or Luma_MSBD. There is a step that concatenates the output QT depth map with the input image block. The inputs of the four sub-nets are actually different.
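The concatenation step mentioned above could look roughly like the following NumPy sketch. It is only an illustration, not the repository's code: the function name `concat_depth_map`, the channel-first layout, and the nearest-neighbour upsampling of the depth map to the block resolution are all assumptions.

```python
import numpy as np

def concat_depth_map(block, qt_depth):
    """Append a QT depth map to an image block as an extra channel.

    block:    (C, H, W) image block (hypothetical layout).
    qt_depth: (h, w) predicted QT depth map, with H % h == 0 and W % w == 0.
    Returns a (C + 1, H, W) float32 array.
    """
    _, h, w = block.shape
    dh, dw = qt_depth.shape
    if h % dh or w % dw:
        raise ValueError("block size must be a multiple of the depth map size")
    # Assumption: repeat each depth value so the map matches the block size.
    up = np.repeat(np.repeat(qt_depth, h // dh, axis=0), w // dw, axis=1)
    return np.concatenate(
        [block.astype(np.float32), up[None].astype(np.float32)], axis=0
    )
```

Under these assumptions, a sub-net that consumes the concatenated tensor sees one more input channel than the sub-net that predicts the QT depth, which is why the inputs of the four sub-nets differ.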

GongchunDing commented 1 year ago

Thank you for your patient answer. I'm sorry that my description above was not accurate, and I still have some questions. What I mean is that inference_VVC() calls the Load_Infe_VP_Dataset() function during inference. The input parameter pre_path of this function is the processed .npy file, and its output is either the Y component or the UV components. The inference_pre_QBD() function is then called. When the chroma component is processed, the data in infe_loader_QB is the 2-channel UV output of the previous step, while Chroma_Q_Net() expects a 3-channel input.

AolinFeng commented 1 year ago

I checked the code, and your concern is indeed an issue. The input of Chroma_Q_Net() should have 3 channels: the downsampled Y plus U and V (for the yuv420 format). This design reflects the fact that, in the encoder, chroma coding relies heavily on luma coding, so we think the Y component should help chroma partition prediction.
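As a rough illustration of that 3-channel input, the sketch below downsamples a luma plane to the chroma resolution and stacks it with U and V. It is not the repository's code: the function names are hypothetical, and 2x2 averaging is only one possible downsampling choice (the filter actually used is not stated in this thread).

```python
import numpy as np

def downsample_luma_2x(y):
    """Average each 2x2 block of an (H, W) luma plane (H and W must be even)."""
    h, w = y.shape
    if h % 2 or w % 2:
        raise ValueError("luma plane must have even height and width")
    return y.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_chroma_input(y, u, v):
    """Stack downsampled Y with U and V into a (3, H/2, W/2) float32 array.

    Assumes yuv420 sampling, i.e. U and V are half the luma size in each
    dimension.
    """
    y_ds = downsample_luma_2x(y.astype(np.float32))
    if y_ds.shape != u.shape or u.shape != v.shape:
        raise ValueError("U and V must be half the luma size in each dimension")
    return np.stack(
        [y_ds, u.astype(np.float32), v.astype(np.float32)], axis=0
    )
```

For a 64x64 luma block with 32x32 U and V planes, `build_chroma_input` returns an array of shape (3, 32, 32), matching the 3-channel input described above.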

The Model_QBD.py should be ok. I think I uploaded the wrong version of inference_QBD.py. I will fix that as soon as possible. Thank you for pointing out the issue.

AolinFeng commented 1 year ago

I have fixed the bugs mentioned above in a new pull request.