tianyili2017 / HEVC-Complexity-Reduction

Source programs to test the deep-learning-based complexity reduction approach for HEVC, at both intra- and inter-modes.

Why does reproducing the AI training fail? Valid loss is extremely large #13

Open linchengsi opened 5 years ago

linchengsi commented 5 years ago

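Hello! I downloaded your code and ran train_CNN_CTU64.py in HEVC-Complexity-Reduction-master\ETH-CNN_Training_AI directly. The readme says it can be run as-is, so I did not modify anything. However, the result plots I got are very different from the ones already in the models folder. Mine look like this: [image] One of the results you provided (already in the models folder) looks like this: [image] As you can see, my valid loss and other values are abnormal and extremely large, but I have only just started studying the code and do not yet understand what causes the problem. Could you explain? Thanks!

(My setup: Windows 10, GTX1060MQ, Python 3.7, TensorFlow 1.13.1, run from PyCharm.) My email is 1043715625@qq.com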

tianyili2017 commented 5 years ago

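Hello, the program can indeed be run directly, but because the full database is too large, the bundled data files are only a small part of all the data and are meant for checking that the program runs end to end. Training on them for 1 million iterations can easily overfit. I also plan to download the code again and run it to see whether I get results like yours. If you want to reproduce the results in the paper, I recommend following the training instructions: generate the complete data files from the original YUV videos and labels, then train. Thank you for pointing out this issue with the program and for your interest in our work!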

tianyili2017 commented 5 years ago

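Hello. I ran it again and the results are very similar to your plots, which indicates that overfitting does occur on the small sample. I will add a note to the training instructions: the demo sets are only for getting the program to run and can be used to verify that TensorFlow and the rest of the environment are installed correctly, but not for reproducing the paper's results. To reproduce results close to those in the paper, please generate the complete training data first. Thank you for your understanding!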
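To illustrate the overfitting pattern described above (validation loss that stops improving and then grows while training continues), here is a minimal early-stopping sketch. It is not part of this repository's training script; the function name `early_stop_index`, the `patience` parameter, and the loss values are all hypothetical and chosen only for illustration.

```python
def early_stop_index(valid_losses, patience=3):
    """Return the index of the best validation loss seen before stopping.

    Scanning stops once `patience` consecutive evaluations fail to
    improve on the best validation loss seen so far.
    """
    best_idx = 0
    best_loss = float("inf")
    bad_evals = 0
    for i, loss in enumerate(valid_losses):
        if loss < best_loss:
            # New best value: remember it and reset the counter.
            best_idx, best_loss, bad_evals = i, loss, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                break
    return best_idx


# Made-up validation losses: they improve at first, then blow up.
valid_losses = [0.90, 0.70, 0.65, 0.68, 0.80, 1.50, 4.00]
best = early_stop_index(valid_losses, patience=3)
print("best checkpoint index:", best)
```

On a tiny demo set, a check like this would only limit the damage from overfitting; it would not replace generating the complete training data as recommended above.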

zhoushuairan commented 4 years ago

Hello, after downloading your code I ran HEVC-Complexity-Reduction-master\HM-16.5_Test_AI\build\HM_vc10.sln, and the following output appeared:

Tensor("Conv2D:0", shape=(?, 1, 1, 1), dtype=float32)
Tensor("ResizeNearestNeighbor:0", shape=(?, 16, 16, 1), dtype=float32)
Tensor("LeakyRelu:0", shape=(?, 4, 4, 16), dtype=float32)
Tensor("LeakyRelu_1:0", shape=(?, 2, 2, 24), dtype=float32)
Tensor("LeakyRelu_2:0", shape=(?, 1, 1, 32), dtype=float32)
Tensor("Conv2D_4:0", shape=(?, 2, 2, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_1:0", shape=(?, 32, 32, 1), dtype=float32)
Tensor("LeakyRelu_3:0", shape=(?, 8, 8, 16), dtype=float32)
Tensor("LeakyRelu_4:0", shape=(?, 4, 4, 24), dtype=float32)
Tensor("LeakyRelu_5:0", shape=(?, 2, 2, 32), dtype=float32)
Tensor("Conv2D_8:0", shape=(?, 4, 4, 1), dtype=float32)
Tensor("ResizeNearestNeighbor_2:0", shape=(?, 64, 64, 1), dtype=float32)
Tensor("LeakyRelu_6:0", shape=(?, 16, 16, 16), dtype=float32)
Tensor("LeakyRelu_7:0", shape=(?, 8, 8, 24), dtype=float32)
Tensor("LeakyRelu_8:0", shape=(?, 4, 4, 32), dtype=float32)
Tensor("concat:0", shape=(?, 2688), dtype=float32)
Tensor("cond/Merge:0", shape=(?, 64), dtype=float32)
Tensor("cond_1/Merge:0", shape=(?, 48), dtype=float32)
Tensor("cond_2/Merge:0", shape=(?, 1), dtype=float32)
Tensor("cond_3/Merge:0", shape=(?, 128), dtype=float32)
Tensor("cond_4/Merge:0", shape=(?, 96), dtype=float32)
Tensor("cond_5/Merge:0", shape=(?, 4), dtype=float32)
Tensor("cond_7/Merge:0", shape=(?, 256), dtype=float32)
Tensor("cond_8/Merge:0", shape=(?, 192), dtype=float32)
Tensor("cond_9/Merge:0", shape=(?, 16), dtype=float32)
D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv frame 501/501 416x240

Predicting Time : 7.986 sec.


HM software: Encoder Version [16.5] (including RExt)[Windows][VS 1900][64 bit]

python video_to_cu_depth.py D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv 416 240 32

Input File : D:\HM\HM-16.5_Test_AI\bin\vc10\x64\Release\BasketballPass_416x240_50.yuv
Bitstream File : str.bin
Reconstruction File : rec.yuv
Real Format : 416x240 50Hz
Internal Format : 416x240 50Hz
Sequence PSNR output : Linear average only
Sequence MSE output : Disabled
Frame MSE output : Disabled
Cabac-zero-word-padding : Enabled
Frame/Field : Frame based coding
Frame index : 0 - 99 (100 frames)
Profile : main
CU size / depth / total-depth : 64 / 4 / 4
RQT trans. size (min / max) : 4 / 32
Max RQT depth inter : 3
Max RQT depth intra : 3
Min PCM size : 8
Motion search range : 64
Intra period : 1
Decoding refresh type : 0
QP : 32.00
Max dQP signaling depth : 0
Cb QP Offset : 0
Cr QP Offset : 0
QP adaptation : 0 (range=0)
GOP size : 1
Input bit depth : (Y:8, C:8)
MSB-extended bit depth : (Y:8, C:8)
Internal bit depth : (Y:8, C:8)
PCM sample bit depth : (Y:8, C:8)
Intra reference smoothing : Enabled
diff_cu_chroma_qp_offset_depth : -1
extended_precision_processing_flag : Disabled
implicit_rdpcm_enabled_flag : Disabled
explicit_rdpcm_enabled_flag : Disabled
transform_skip_rotation_enabled_flag : Disabled
transform_skip_context_enabled_flag : Disabled
cross_component_prediction_enabled_flag: Disabled
high_precision_offsets_enabled_flag : Disabled
persistent_rice_adaptation_enabled_flag: Disabled
cabac_bypass_alignment_enabled_flag : Disabled
log2_sao_offset_scale_luma : 0
log2_sao_offset_scale_chroma : 0
Cost function: : Lossy coding (default)
RateControl : 0
Max Num Merge Candidates : 5

TOOL CFG: IBD:0 HAD:1 RDQ:1 RDQTS:1 RDpenalty:0 SQP:0 ASR:0 FEN:1 ECU:0 FDM:1 CFM:0 ESD:0 RQT:1 TransformSkip:1 TransformSkipFast:1 TransformSkipLog2MaxSize:2 Slice: M=0 SliceSegment: M=0 CIP:0 SAO:1 PCM:0 TransQuantBypassEnabled:0 WPP:0 WPB:0 PME:2 WaveFrontSynchro:0 WaveFrontSubstreams:1 ScalingList:0 TMVPMode:1 AQpS:0 SignBitHidingFlag:1 RecalQP:0

Non-environment-variable-controlled macros set as follows:

                            RExt__DECODER_DEBUG_BIT_STATISTICS =   0
                                  RExt__HIGH_BIT_DEPTH_SUPPORT =   0
                        RExt__HIGH_PRECISION_FORWARD_TRANSFORM =   0
                                    O0043_BEST_EFFORT_DECODING =   0

               Input ChromaFormatIDC =   4:2:0
   Output (internal) ChromaFormatIDC =   4:2:0

So I ran the file video_to_cu_depth.py under D:\HM\HM-16.5_Test_AI\bin on its own, and got:

File "video_to_cu_depth.py", line 120, in
assert len(sys.argv)==5
AssertionError

Why does this happen? I would be very grateful if you could answer!
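A note on the assertion: `sys.argv[0]` is the script name, so `len(sys.argv) == 5` holds only when exactly four command-line arguments follow it. The encoder log above invokes the script as `python video_to_cu_depth.py <yuv path> 416 240 32`, which has four arguments, whereas running the script with no arguments gives `len(sys.argv) == 1` and trips the assertion. The sketch below only illustrates that check and is not the repository's code. The function `parse_args` is hypothetical, and the argument meanings (YUV path, width, height, QP) are assumptions inferred from the log, where the frame size is 416x240 and QP is 32.

```python
def parse_args(argv):
    """Validate and unpack an argv-style list of five items.

    Assumed layout (inferred from the encoder log, not confirmed):
    [script_name, yuv_path, width, height, qp]
    """
    # argv[0] is the script name, so four user arguments give length 5.
    if len(argv) != 5:
        raise ValueError(
            "expected 4 arguments: <yuv_path> <width> <height> <qp>, "
            "got %d" % (len(argv) - 1)
        )
    yuv_path = argv[1]
    width, height, qp = int(argv[2]), int(argv[3]), int(argv[4])
    return yuv_path, width, height, qp


# Same argument pattern as the invocation shown in the encoder log.
example = ["video_to_cu_depth.py", "BasketballPass_416x240_50.yuv", "416", "240", "32"]
print(parse_args(example))
```

Under these assumptions, running the script standalone would need the same four arguments that the encoder passes to it.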