msracver / Deep-Exemplar-based-Colorization

The source code of "Deep Exemplar-based Colorization".
https://arxiv.org/abs/1807.06587
MIT License

What is the use of the file FLOW? #3

Closed cccusername closed 4 years ago

cccusername commented 6 years ago

Hello, could you please tell me how to use the flow files under Deep-Exemplar-based-Colorization-master\demo\example? Thank you! @hmmlillian @cddlyf

cddlyf commented 6 years ago

You need to use the flow files to generate error maps; similarity_combo.exe will read the flow files under the flow subfolder. Please refer to the instructions for step 2:

```
similarity_combo.exe [MODEL_DIR] [INPUT_ROOT_DIR] [START_LINE_ID] [END_LINE_ID] [GPU_ID]
```

e.g., `exe\similarity_combo.exe models\similarity_subnet\ example\ 0 2 0`

cccusername commented 6 years ago

@cddlyf I replaced the pictures under demo\example\input with my own pictures and ran the command `exe\similarity_combo.exe models\similarity_subnet\ example\ 0 2 0`, but the flow files are not updated. What should I do to generate the new flow files?

hmmlillian commented 6 years ago

We use "Deep Image Analogy" to generate the flow files as default, and thus the file format we support in similarity_combo.exe is the same as "Deep Image Analogy", which is a text file storing flow field.

The flow field is stored in row-major order, and each line contains a horizontal (u) and a vertical (v) flow component for each pixel, as follows: 0 5 -1 2 3 -2 ...

Therefore, you can replace "Deep Image Analogy" with any other dense correspondence estimation algorithm and output flow files in the same format for similarity_combo.exe.
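For illustration, here is a minimal Python sketch of a writer for this format (write_flow_txt is a hypothetical helper, and one "u v" pair per pixel per line is one reading of the description above), so the output of any matching algorithm can be saved for similarity_combo.exe:

```python
import numpy as np

def write_flow_txt(flow, path):
    """Write an (H, W, 2) dense flow field as a text file:
    row-major order, one "u v" (horizontal, vertical) pair
    per pixel per line."""
    h, w, _ = flow.shape
    with open(path, "w") as f:
        for y in range(h):
            for x in range(w):
                u, v = flow[y, x]
                f.write("%d %d\n" % (round(float(u)), round(float(v))))
```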

hmmlillian commented 6 years ago

@cccusername You need to generate the flow files using either "Deep Image Analogy" or any other matching algorithm before executing similarity_combo.exe.

Note that the flow files should be named by concatenating the names of the gray-scale target image and the reference color image with an underscore "_".

For example, if the target is "in1.jpg" and the reference is "ref1.jpg", the forward flow file should be "in1_ref1.txt" and the backward one "ref1_in1.txt".
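A tiny Python sketch of that naming convention (flow_filenames is a hypothetical helper, not part of the repo):

```python
import os

def flow_filenames(target_image, reference_image):
    """Map ("in1.jpg", "ref1.jpg") -> ("in1_ref1.txt", "ref1_in1.txt"),
    i.e. the forward and backward flow file names."""
    t = os.path.splitext(os.path.basename(target_image))[0]
    r = os.path.splitext(os.path.basename(reference_image))[0]
    return "%s_%s.txt" % (t, r), "%s_%s.txt" % (r, t)
```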

cccusername commented 6 years ago

@hmmlillian Thank you! When I run "Deep Image Analogy", a new error comes up: Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED. My CUDA version is 8.0 and my cuDNN version is 5.

hmmlillian commented 6 years ago

@cccusername The problem may be caused by a mismatched cuDNN or CUDA version. Did you use the executable program we provide, or did you compile it yourself?

cccusername commented 6 years ago

@hmmlillian I used the executable program you provide.

hmmlillian commented 6 years ago

@cccusername First, we suggest compiling the "Deep Image Analogy" project provided under the folder "similarity_subnet/windows/deep_image_analogy/" in your environment. Meanwhile, we will try to reproduce this problem and fix it.

cccusername commented 6 years ago

@hmmlillian I tried to compile "Deep Image Analogy" by myself, but I can't open deep_image_analogy.vcxproj under the folder "similarity_subnet/windows/deep_image_analogy/".

hmmlillian commented 6 years ago

@cccusername I just downloaded the whole project and tested it, but I did not meet this problem. Did you do the following?

  1. Edit "CommonSettings.props.example" (under similarity_subnet/windows/) to make the cuda version and cudnn path (if used) in it match yours, and save it as "CommonSettings.props" .
  2. Open solution Caffe (under similarity_subnet/windows/) and add deep_image_analogy (under similarity_subnet/windows/deep_image_analogy/) project.
  3. Build Caffe and the project deep_image_analogy. Hope this is helpful.
cccusername commented 6 years ago

@hmmlillian I did the 3 steps above, but it doesn't work. Then I downloaded the project from "Deep-Image-Analogy" and set Ratio=0.2; it works, but the result is pretty bad. Could you tell me your ratio, or how to disable the time limit (TDR)? Thanks a lot!

hmmlillian commented 6 years ago

@cccusername The parameters we use for "Deep Image Analogy" are defined in Deep-Exemplar-based-Colorization/similarity_subnet/windows/deep_image_analogy/source/main.cpp, and we set Ratio=1. For efficiency, we skip the matching on the finest layer relu1_1 (which does not degrade the matching quality much). We suggest downscaling the shorter edge to 256 pixels if you find it too costly.

To disable TDR, you may change the TDR-related registry keys as documented at https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys, or use the NVIDIA Nsight Development Platform (https://docs.nvidia.com/gameworks/content/developertools/desktop/timeout_detection_recovery.htm) if you use an NVIDIA GPU on Windows.
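As one concrete route, here is a minimal Python sketch that raises the GPU timeout via the documented TdrDelay registry value rather than disabling TDR entirely (editing HKLM needs administrator rights, and a reboot is required for the change to take effect; see the Microsoft page above for the other TDR keys such as TdrLevel):

```python
import winreg

# Raise the GPU timeout to 60 seconds via the documented TdrDelay
# value (see the Microsoft TDR registry keys page linked above).
# Requires administrator rights; reboot for the change to apply.
key = winreg.CreateKeyEx(
    winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers",
    0,
    winreg.KEY_SET_VALUE,
)
winreg.SetValueEx(key, "TdrDelay", 0, winreg.REG_DWORD, 60)
winreg.CloseKey(key)
```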

Thanks!

cccusername commented 6 years ago

@hmmlillian Thank you very much! I have solved this problem!

cccusername commented 6 years ago

@hmmlillian Sorry to disturb you again. Could you please release the training model? I would really appreciate it if you could, thank you!

hmmlillian commented 6 years ago

@cccusername I am not sure what kind of training model you need. Do you mean our pre-trained model "vgg19_bn_gray_ft_iter_150000.caffemodel" for the Similarity Subnet and "example_net.pth" for the Colorization Subnet, or some other model?

cccusername commented 6 years ago

I mean "example_net.pth" for Colorization Subnet. Thank you! @hmmlillian

hmmlillian commented 6 years ago

@cccusername I think you can download our pre-trained models from the following links:

  1. Go to the demo\models\similarity_subnet\vgg_19_gray_bn\ folder and download: https://www.dropbox.com/s/mnsxsfv5non3e81/vgg19_bn_gray_ft_iter_150000.caffemodel?dl=0
  2. Go to the demo\models\colorization_subnet\ folder and download: https://www.dropbox.com/s/ebtuwj7doteelia/example_net.pth?dl=0

Please try. Thanks!

cccusername commented 6 years ago

@hmmlillian I am very sorry for my unclear phrasing. I mean the training code for the Colorization Subnet. Thanks!

hmmlillian commented 6 years ago

@cccusername For patent reasons, we are not planning to release the training code for now. If you are interested in more implementation details, we would be very glad to discuss them with you.

cccusername commented 6 years ago

Thank you! First, let's assume there is only the Chrominance branch in the Colorization sub-net. I am confused about this branch: what does "T_ab" represent? If "T_ab(p)" represents the colorized picture guided by the reference, could I regard "T_ab" as a label?

hmmlillian commented 6 years ago

@cccusername In our Chrominance branch, T_ab is the ground truth (i.e., the chrominance of the original color target image), and P_ab is the colorized result under the guidance of T'_ab (T'_ab is the "fake" reference obtained by warping the ground truth T_ab with the bidirectional mapping functions). The chrominance loss is defined as the L1 distance between P_ab and T_ab. We assume that correct color samples in T'_ab (those similar to T_ab, which lead to a smaller distance) are very likely to lie at the same positions as correct color samples in R'_ab (R'_ab is the "real" reference warped using the reference chrominance R_ab).
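Read literally, the loss above is just an L1 penalty between the predicted and ground-truth chrominance. A minimal PyTorch sketch (the (N, 2, H, W) ab-channel layout is an assumption; the released code does not include training):

```python
import torch
import torch.nn.functional as F

def chrominance_loss(P_ab: torch.Tensor, T_ab: torch.Tensor) -> torch.Tensor:
    """L1 distance between the predicted chrominance P_ab and the
    ground-truth chrominance T_ab, both assumed (N, 2, H, W)."""
    return F.l1_loss(P_ab, T_ab)
```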

cccusername commented 6 years ago

@hmmlillian

  1. According to your paper, P^T_ab is the colorized result under the guidance of T'_ab. If I ignore the Perceptual branch, could I think P^T_ab is equal to P_ab?
  2. As you said, T_ab is the ground truth, and the chrominance loss is defined as the L1 distance between P_ab and T_ab. Could I think the aim of the chrominance loss is to make P_ab (or P^T_ab) close to T_ab?

hmmlillian commented 6 years ago

@cccusername Yes, I think your understanding is correct.

cccusername commented 6 years ago

@hmmlillian Thanks for your patience! I still don't understand how the chrominance loss could make the gray picture colorized under the guidance of the reference. For example, given two cars, where the input one is red and the reference one is blue: after the Chrominance branch, we want the input car (the gray picture) to become blue. But the chrominance loss makes the output close to its original color. I think my understanding contains some contradictions.

cccusername commented 6 years ago

@hmmlillian Hello, if possible, could you please release the trained model for the Chrominance branch only (α=0)?

hmmlillian commented 6 years ago

@cccusername You may download the model via the following link: https://drive.google.com/file/d/19KCRoUSiwg0SELUT8bXzjpPS9sXz2X4L/view?usp=sharing

For your previous question: in the training stage, the input is the aligned ground truth (neither the original ground truth nor the aligned true reference), which is used to guide the subsequent colorization, while in the inference stage the input is replaced with the aligned true reference. In other words, the Chrominance branch does not memorize the ground-truth chrominance but learns the ability to transfer chrominance from its input by training on a large-scale dataset.
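As a rough sketch of that train/inference switch (warp() here is a simplified nearest-neighbour stand-in for the paper's bidirectional mapping functions, and all names are hypothetical; the released code covers only inference):

```python
import numpy as np

def warp(img_ab, flow):
    """Nearest-neighbour warp of an (H, W, 2) chrominance map by an
    (H, W, 2) flow field; a simplified stand-in for the paper's
    bidirectional mapping functions."""
    h, w, _ = img_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xw = np.clip(xs + np.rint(flow[..., 0]).astype(int), 0, w - 1)
    yw = np.clip(ys + np.rint(flow[..., 1]).astype(int), 0, h - 1)
    return img_ab[yw, xw]

def chrominance_guidance(training, T_ab, R_ab, flow_ab, flow_ba):
    """Select the guidance input of the Chrominance branch: the
    aligned ground truth during training, the aligned real
    reference at inference."""
    if training:
        # "fake" reference: ground-truth chrominance warped through
        # the bidirectional mapping functions (a -> b -> a)
        return warp(warp(T_ab, flow_ab), flow_ba)
    # real reference chrominance aligned to the target
    return warp(R_ab, flow_ba)
```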

cccusername commented 6 years ago

Thank you! @hmmlillian So the inputs are different between the training stage and the inference stage? What is the aligned ground truth and how do you get it?

hmmlillian commented 6 years ago

@cccusername The inputs are different between the Chrominance branch and the Perceptual branch. In the Chrominance branch, the input is the aligned ground-truth image, which is obtained by warping the ground-truth image with the bidirectional mapping functions. Please refer to the paper for more details.

cccusername commented 6 years ago

@hmmlillian Thanks again! I found that the input is warped_ba in your testing code. So, in the training step, the input of the Chrominance branch is warped_aba and the input of the Perceptual branch is warped_ba. Is this right?

hmmlillian commented 6 years ago

@cccusername Yes, actually the input is always an aligned "reference b". In the testing stage, the "reference b" is the real reference b (b=b). In the training stage, the "reference b" is the real reference b (b=b) for the Perceptual branch, while for the Chrominance branch b is a "fake" reference warped_ab (b=warped_ab, which is reconstructed with the ground-truth chrominance).

cccusername commented 6 years ago

@hmmlillian Thank you! You are so nice!