fh2019ustc / DocTr-Plus

The official code for “Deep Unrestricted Document Image Rectification”, TMM, 2023.
https://project.doctrp.top/
MIT License
375 stars, 40 forks

How to get the ground truth? #6

Open Moonflynn opened 1 year ago

Moonflynn commented 1 year ago

The loss function is defined as the L1 distance between the predicted warping flow $f_b$ and its ground truth $f_{gt}$. However, how is the ground truth obtained? The linked UDIR test set contains only the original and scanned images; there is no ground-truth map of pixel-wise correspondences between the two images.

fh2019ustc commented 1 year ago

Hi, we train our network using the Doc3D dataset and then evaluate it using the UDIR dataset. The Doc3D dataset contains the ground truth warping flow between the distorted and scanned images.

Moonflynn commented 1 year ago

Thanks! I have some other questions about the Doc3D dataset.

  1. Do the 'bm' files store the backward warping flow and the 'uv' files store the forward warping flow?
  2. The size of the image and the bm is 448×448. If I want to resize them to 512×512, I can use cv2.resize(image) to resize the image, but how can I get the resized warping flow?

Thank you in advance for your time and consideration. If you could kindly provide me with answers to these questions, I would greatly appreciate it.

zhaolitc commented 1 week ago

@Moonflynn Hi, did you figure out how to resize the bm file to a new resolution? I tried using cv2.resize on the bm file just as for an RGB image, but when I trained DocTr, the results were not good. Maybe we should multiply each value in bm by a resolution-dependent scaling factor?
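
A minimal sketch of the scaling idea raised above, not the authors' confirmed procedure. It assumes the bm array has shape (H, W, 2), with channel 0 holding x and channel 1 holding y as absolute pixel coordinates into a distorted image of the same size, and that the image is resized to the same new resolution. Under those assumptions the map needs two steps: resample the grid spatially, then rescale the stored coordinates. The helper names `resize_bilinear` and `resize_backward_map` are hypothetical, and the corner-aligned convention is chosen only to keep the example exact; a different resize convention changes the scale factor slightly. If the map is instead stored in normalized coordinates (for example in [-1, 1]), only the spatial resampling should be needed.

```python
import numpy as np

def resize_bilinear(arr, new_h, new_w):
    """Bilinearly resample an (H, W, C) array to (new_h, new_w, C), corner-aligned."""
    h, w = arr.shape[:2]
    ys = np.linspace(0, h - 1, new_h)   # sample rows in input coordinates
    xs = np.linspace(0, w - 1, new_w)   # sample columns in input coordinates
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]       # vertical interpolation weights
    wx = (xs - x0)[None, :, None]       # horizontal interpolation weights
    top = arr[y0][:, x0] * (1 - wx) + arr[y0][:, x1] * wx
    bot = arr[y1][:, x0] * (1 - wx) + arr[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def resize_backward_map(bm, new_h, new_w):
    """Resize a backward map holding absolute pixel coordinates (assumed layout).

    Step 1 resamples the grid; step 2 rescales the coordinate values so they
    point into the resized source image instead of the original one.
    """
    h, w = bm.shape[:2]
    out = resize_bilinear(bm.astype(np.float64), new_h, new_w)
    out[..., 0] *= (new_w - 1) / (w - 1)   # x coordinates
    out[..., 1] *= (new_h - 1) / (h - 1)   # y coordinates
    return out

# Sanity check: an identity map at 448x448 should stay an identity map at 512x512.
ys, xs = np.mgrid[0:448, 0:448]
identity_448 = np.stack([xs, ys], axis=-1).astype(np.float64)
identity_512 = resize_backward_map(identity_448, 512, 512)
```

Resizing the bm like an RGB image performs only the first step, so under the absolute-coordinate assumption the values would still index a 448×448 image, which would be consistent with the poor training results described above.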