ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
I tried to run inference with your model in Colab, but I got an error:
File "/content/DocDiff/model/DocDiff.py", line 315, in forward
    x = torch.cat((x, s), dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 440 but got size 439 for tensor number 1 in the list.
# model
IMAGE_SIZE : [128, 128] # load image size, if it's train mode, it will be randomly cropped to IMAGE_SIZE. If it's test mode, it will be resized to IMAGE_SIZE.
CHANNEL_X : 3 # input channel
CHANNEL_Y : 3 # output channel
TIMESTEPS : 100 # diffusion steps
SCHEDULE : 'linear' # linear or cosine
MODEL_CHANNELS : 32 # basic channels of Unet
NUM_RESBLOCKS : 1 # number of residual blocks
CHANNEL_MULT : [1,2,3,4] # channel multiplier of each layer
NUM_HEADS : 1
MODE : 0 # 1 Train, 0 Test
PRE_ORI : 'True' # if True, predict $x_0$, else predict $\epsilon$.
# test
NATIVE_RESOLUTION : 'False' # if True, test with native resolution
DPM_SOLVER : 'False' # if True, test with DPM_solver
DPM_STEP : 20 # DPM_solver step
BATCH_SIZE_VAL : 1 # test batch size
TEST_PATH_GT : '/content/drive/MyDrive/wight/data/' # path of ground truth
TEST_PATH_IMG : '/content/drive/MyDrive/wight/data/' # path of input
TEST_INITIAL_PREDICTOR_WEIGHT_PATH : '/content/drive/MyDrive/wight/init_predictor_document_deblurring.pth' # path of initial predictor
TEST_DENOISER_WEIGHT_PATH : '/content/drive/MyDrive/wight/denoiser_document_deblurring.pth' # path of denoiser
TEST_IMG_SAVE_PATH : './results'
Hello, as mentioned in the Guide section of README.EN.md, please make sure that the width and height of the input image are both multiples of 8. An image of size [1654, 2339, 3] therefore needs to be padded to [1656, 2344, 3].
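For anyone hitting the same error, here is a minimal sketch of that padding step. It is not code from the DocDiff repository: the helper names `pad_to_multiple` and `crop_to_original` are hypothetical, and padding on the bottom and right with edge replication is an assumption (other padding modes or placements may suit your images better).

```python
import numpy as np

def pad_to_multiple(img, multiple=8, mode="edge"):
    """Pad an H x W x C array on the bottom and right so that
    H and W become multiples of `multiple` (hypothetical helper)."""
    h, w = img.shape[:2]
    pad_h = (-h) % multiple  # rows to add; 0 if already aligned
    pad_w = (-w) % multiple  # columns to add; 0 if already aligned
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode=mode)
    return padded, (h, w)

def crop_to_original(img, original_hw):
    """Remove the padding again, e.g. from the restored output."""
    h, w = original_hw
    return img[:h, :w]

# Dummy image with the size reported in this issue.
image = np.zeros((1654, 2339, 3), dtype=np.uint8)
padded, original_hw = pad_to_multiple(image)
print(padded.shape)  # (1656, 2344, 3)
```

After inference, `crop_to_original(output, original_hw)` would bring the result back to the original size, assuming the output is also an H x W x C array.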
My conf.yml is the configuration listed above.
Input image has size [1654, 2339, 3]
Could you help me solve the problem?