ryanxingql / stdf-pytorch

Implementation of "Spatio-Temporal Deformable Convolution for Compressed Video Quality Enhancement" (AAAI'20).
https://www.aiide.org/ojs/index.php/AAAI/article/view/6697
Apache License 2.0
155 stars 20 forks

question about create_lmdb #10

Closed limwangkai closed 2 years ago

limwangkai commented 2 years ago

Hi, thank you for sharing the project. When creating the LMDB, why does the ground truth use only the center frame of each sequence, while LQ uses all the frames?

ryanxingql commented 2 years ago

Hi limwangkai

Because in STDF, ground truth frames are used only for supervision (calculating loss).

If you use the Vimeo dataset, each sequence has 7 frames (1, 2, ..., 7). We feed the 7 compressed frames into STDF to enhance the center 4-th frame, which yields one enhanced frame. We then compute the loss, e.g., MSE loss, between that enhanced frame and the corresponding 4-th ground-truth frame. Therefore, each sequence needs 7 compressed frames but only 1 ground-truth frame, the one that corresponds to the center compressed frame.

If you use another dataset such as the MFQEv2 dataset, each video is split into 7-frame sequences for training. These sequences can overlap:

sequence one: 1, 2, 3, 4, 5, 6, 7, the center frame is the 4-th frame.
sequence two: 2, 3, 4, 5, 6, 7, 8, the center frame is the 5-th frame.
sequence three: 3, 4, 5, 6, 7, 8, 9, the center frame is the 6-th frame.
...
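For illustration, the overlapping split listed above can be sketched as a stride-1 sliding window. This is a minimal sketch, not the repository's code: the function name `overlapping_sequences` and its parameters are hypothetical, and frame numbers are 1-based to match the listing.

```python
def overlapping_sequences(num_frames, radius=3):
    """Return (frame_numbers, center) pairs for stride-1 windows.

    Hypothetical helper for illustration only; frame numbers are 1-based.
    """
    window = 2 * radius + 1  # 7 frames when radius is 3
    pairs = []
    # The last valid window starts at num_frames - window + 1.
    for start in range(1, num_frames - window + 2):
        frames = list(range(start, start + window))
        center = start + radius  # the only frame that needs ground truth
        pairs.append((frames, center))
    return pairs


seqs = overlapping_sequences(num_frames=9)
for frames, center in seqs:
    print(frames, "-> center", center)
```

With 9 frames this produces the three sequences above, with centers 4, 5 and 6.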

limwangkai commented 2 years ago

Thanks for your reply. In create_lmdb_mfqev2, the code that generates the LMDB for GT is:

    num_seq = nfs // (2 * radius + 1)
    frm_list.append([radius + iter_seq * (2 * radius + 1) for iter_seq in range(num_seq)])

It looks like each video is split into sequences as follows:

sequence one: 1, 2, 3, 4, 5, 6, 7, the center frame is the 4-th frame.
sequence two: 8, 9, 10, 11, 12, 13, 14, the center frame is the 11-th frame.
...
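The indexing in the quoted snippet can be checked with a small standalone sketch. The names `nfs` and `radius` come from the snippet; the wrapper function `center_indices` is hypothetical, and the sketch assumes the snippet's indices are 0-based (so index 3 is the 4-th frame).

```python
def center_indices(nfs, radius=3):
    """Center-frame indices (0-based, by assumption) of non-overlapping windows.

    Hypothetical wrapper around the expression quoted in this thread.
    """
    window = 2 * radius + 1
    num_seq = nfs // window  # trailing frames that do not fill a window are dropped
    return [radius + iter_seq * window for iter_seq in range(num_seq)]


centers = center_indices(nfs=21)
print(centers)                   # 0-based center indices
print([c + 1 for c in centers])  # the same centers as 1-based frame numbers
```

For a 21-frame video this gives 0-based centers 3, 10 and 17, i.e. the 4-th, 11-th and 18-th frames, which matches the non-overlapping reading above.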

ryanxingql commented 2 years ago

Yes, the training sequences can also be non-overlapping, since the MFQEv2 dataset is big enough to generate enough sequences that way.

At test time, however, the input sequences should overlap so that every frame gets enhanced. But we do not use LMDB for testing.
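As a rough sketch of what overlapping test-time inputs could look like, the snippet below builds one window per frame so that each frame is the center of its own window. The function name `per_frame_windows` is hypothetical, and the boundary handling (clamping out-of-range indices to the first or last frame) is an assumption; this thread does not say how the edges of a video are treated.

```python
def per_frame_windows(num_frames, radius=3):
    """One 0-based index window per frame, centered on that frame.

    Hypothetical helper. Indices that fall outside the video are clamped
    to the nearest valid frame, which is an assumption for illustration.
    """
    windows = []
    for center in range(num_frames):
        idxs = [
            min(max(i, 0), num_frames - 1)  # clamp to [0, num_frames - 1]
            for i in range(center - radius, center + radius + 1)
        ]
        windows.append(idxs)
    return windows


wins = per_frame_windows(num_frames=5)
for center, idxs in enumerate(wins):
    print(center, idxs)
```

Every frame appears exactly once as a window center, so the whole video is enhanced.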