fh2019ustc / DocGeoNet

The official code for “Geometric Representation Learning for Document Image Rectification”, ECCV, 2022.

a question about prepossessing #2

Closed hanquansanren closed 2 years ago

hanquansanren commented 2 years ago

Hello Hao, thanks for your third awesome work on document image dewarping. I have a simple question. In Section 5.2 of the paper, you point out that the preprocessing module and the following rectification module are trained independently, as shown in the following:

(image)

My question is: why don't you train the whole pipeline jointly? Have you run any experiments to verify that the 2-stage rectification works better?

hanquansanren commented 2 years ago

Sorry, the issue title should say "preprocessing" instead of "prepossessing".

fh2019ustc commented 2 years ago

Hi, thanks for your attention to our work, and this is a good question. In fact, the whole network can be trained jointly. We perform the 2-stage rectification here for the following reasons:

  1. The gradients cannot be backpropagated to the preprocessing network because of the non-differentiable mask: the distorted image has to be multiplied by the document mask (0 denotes the background and 1 denotes the foreground document region).
  2. Joint training would consume more GPUs.

I hope this helps.

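
To illustrate the first reason, here is a minimal toy sketch in plain Python, not the DocGeoNet code. It assumes the binary mask is obtained by thresholding a sigmoid output at 0.5, which the thread does not state; the one-parameter "segmentation network" and the names `loss` and `numeric_grad` are hypothetical. It estimates the gradient of a downstream loss with respect to the segmentation parameter by finite differences, once with a hard 0/1 mask and once with a soft mask for comparison.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, feature, pixel, target, hard):
    """Toy pipeline: segment, multiply the pixel by the mask, compare to a target."""
    prob = sigmoid(w * feature)  # stand-in for the segmentation output
    if hard:
        mask = 1.0 if prob > 0.5 else 0.0  # assumed thresholding step (0 = background, 1 = document)
    else:
        mask = prob  # soft mask, shown only for comparison
    masked_pixel = mask * pixel  # distorted image multiplied by the mask
    return (masked_pixel - target) ** 2  # stand-in for the rectification loss


def numeric_grad(w, hard, eps=1e-4, feature=2.0, pixel=0.8, target=0.3):
    """Central finite difference of the toy loss with respect to w."""
    up = loss(w + eps, feature, pixel, target, hard)
    down = loss(w - eps, feature, pixel, target, hard)
    return (up - down) / (2 * eps)


grad_hard = numeric_grad(w=0.7, hard=True)
grad_soft = numeric_grad(w=0.7, hard=False)
print("hard mask gradient:", grad_hard)
print("soft mask gradient:", grad_soft)
```

With the hard mask, a small change in `w` does not change the 0/1 decision, so the loss stays constant and no learning signal reaches the segmentation parameter. The soft mask does pass a gradient, but it is only a comparison point here, not something the authors describe using.
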
hanquansanren commented 2 years ago

I see, that helps a lot. Thanks for your kind response.