ZZZHANG-jx / DocRes

[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
MIT License

Why does the dewarping_loss use only two channels? #8

Open xuewengeophysics opened 2 months ago

xuewengeophysics commented 2 months ago

@ZZZHANG-jx Hi, Jiaxin! Thanks for your great work. I have a question: why does the dewarping_loss use only two channels?

dewarping_loss = l1(pred_im[:,:2,:,:], gt_im[:,:2,:,:])

Thank you!

ZZZHANG-jx commented 2 months ago

Thank you for your question. Unlike the other four tasks, which output RGB images, the dewarping task directly outputs a 2-channel flow field. The two channels give the displacement in the x and y directions, respectively. The dewarped RGB image is then sampled from the input using this flow field, which is the usual practice in document dewarping. The model still produces three output channels, so for dewarping the loss is computed on the first two channels only (the [:,:2,:,:] slice) and the remaining channel is ignored.
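To make the two points above concrete, here is a minimal sketch, not the DocRes implementation. It uses NumPy arrays in place of PyTorch tensors so it runs without a GPU stack. The helper names (dewarping_l1, sample_with_flow) are hypothetical, and the flow convention (backward displacement in pixels, channel 0 = x, channel 1 = y, nearest-neighbour lookup) is an assumption chosen for illustration; the repository may normalise or interpolate differently.

```python
import numpy as np


def dewarping_l1(pred, gt):
    """Mean absolute error over the first two channels only (N, C, H, W layout)."""
    return float(np.abs(pred[:, :2] - gt[:, :2]).mean())


def sample_with_flow(image, flow):
    """Warp an (H, W, 3) image with a (2, H, W) flow using nearest-neighbour lookup.

    Assumption: flow[0] is the x displacement and flow[1] the y displacement,
    in pixels, telling each output pixel where to read from in the input.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Round to the nearest source pixel and clamp to the image borders.
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, h - 1)
    return image[src_y, src_x]


# Toy 3-channel output where only channels 0 and 1 carry the flow.
rng = np.random.default_rng(0)
pred = rng.normal(size=(1, 3, 4, 4))
gt = pred.copy()

gt[:, 2] += 5.0                        # disagreement in the unused third channel
loss_unused = dewarping_l1(pred, gt)   # unaffected: third channel is sliced away

gt[:, 0] += 1.0                        # disagreement in the x-displacement channel
loss_flow = dewarping_l1(pred, gt)     # now the loss reacts

# A constant flow of +1 pixel in x makes each output pixel read its right neighbour.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
flow = np.zeros((2, 4, 4))
flow[0] = 1.0
warped = sample_with_flow(image, flow)
```

In this sketch an error in the third channel leaves the loss at zero, while an error in a flow channel changes it, which is the behaviour the [:,:2,:,:] slice is there to produce. A PyTorch pipeline would typically use bilinear sampling (for example torch.nn.functional.grid_sample) rather than the nearest-neighbour lookup shown here.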