The generated results look bad

PeiChiChen commented 1 month ago

Hello, thanks for your work again.

I downloaded your results and found that many of them are not as good as the figures shown in your paper. What's more, many results do not look realistic, and some are not camouflaged at all. For example, the results below seem abnormal, and such bad results account for a large proportion. Could you give some explanation or illustration of this situation? I look forward to your reply. Thanks!

Attached images: SOD_THUR15K_Giraffe711, SOD_THUR15K_DogJump3077, SOD_SOD_69015

PanchengZhao commented 1 month ago

Thank you for your question.

We set up the two subsets SO and GO to explore how our method performs on open categories, even though the model does not perform very well on this data:

(1) As mentioned in the Quantitative analysis section of the paper, the results on the three subsets follow a stepwise distribution, indicating that model performance is strongly influenced by the image domain gap. The share of successful results is highest on the CO subset and decreases gradually on the other two. This is because only training data from COD10K (78 categories) is used, so the model cannot handle image categories in fully open scenes well. This is difficult and remains a challenge for the entire community.

(2) We sampled the data randomly when collecting these subsets, so the foreground categories are completely open, and many objects are hard to camouflage within a background-replacement framework, even from a human perspective. These categories are hard samples for the model and account for a large percentage of the data.

(3) Diffusion inherently produces diverse results, so repeated runs on the same input give different outputs. A better solution is to generate a set of results and then filter to keep the best one, which is common practice in generative modeling. However, considering the time cost and the fairness of the comparison, we report the results of a single inference pass and apply no further screening, which makes the results look worse.

Our approach provides a new paradigm for camouflaged image generation and suggests solutions from a texture perspective. However, there is currently no recognized way to measure the degree of camouflage within this community, and the model learns to camouflage indirectly from image distributions, without a direct constraint, which may cause certain results to fail to camouflage. In addition, realism relies on the background being a recognizable scene (e.g., room, forest, lawn), and the model is given no additional information about the foreground or the background to generate a specific scene, so some unrealistic results can occur.

In general, the main reasons for the above issues are: (1) The method is not applicable to objects in certain situations. (2) Randomly sampled datasets with open categories are very challenging. (3) The field itself faces open challenges: restricted data categories, no way to define the degree of camouflage, etc. We are exploring and researching these issues further. If you have any insights, please feel free to discuss them with me.
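
The generate-then-filter strategy from point (3) can be sketched roughly as below. This is only a minimal illustration, not the LAKE-RED code: `best_of_n`, `generate`, and `score` are hypothetical names, and the toy functions stand in for a real diffusion sampler and for a camouflage score, which (as noted above) has no recognized definition yet.

```python
import random
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def best_of_n(
    generate: Callable[[int], T],
    score: Callable[[T], float],
    n: int,
    base_seed: int = 0,
) -> Tuple[T, float, List[float]]:
    """Sample n candidates with different seeds and keep the highest-scoring one.

    `generate` and `score` are placeholders (assumptions): in practice they
    would wrap a diffusion sampler and some chosen quality metric.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # One candidate per seed; n == 1 reduces to single-pass inference.
    candidates = [generate(base_seed + i) for i in range(n)]
    scores = [score(c) for c in candidates]
    # Index of the best candidate (first one wins on ties).
    best_index = max(range(n), key=scores.__getitem__)
    return candidates[best_index], scores[best_index], scores


# Toy stand-ins so the sketch runs without any model (purely illustrative).
def toy_generate(seed: int) -> float:
    """Pretend 'image': a seeded random number in [0, 1)."""
    return random.Random(seed).random()


def toy_score(sample: float) -> float:
    """Pretend quality score: higher is better."""
    return sample


if __name__ == "__main__":
    single, single_score, _ = best_of_n(toy_generate, toy_score, n=1)
    best, best_score, all_scores = best_of_n(toy_generate, toy_score, n=8)
    print(f"single pass: {single_score:.3f}, best of 8: {best_score:.3f}")
```

With `n=1` this is the single-inference setting used for the released results. Larger `n` can only match or improve the selected score, at `n` times the sampling cost, which is the time and fairness trade-off described above.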

PeiChiChen commented 1 month ago

I really appreciate your detailed explanation!

My research is also related to camouflaged image generation, and this gives me some inspiration. I think Stable Diffusion has learned many categories of objects, including those in open scenes. So exploring the zero-shot ability of Stable Diffusion for open-scene camouflaged image generation may be a promising direction!

Besides, in the second point of your explanation, you said "there is currently no recognized way to measure the degree of camouflage". There is an ICCV 2023 paper, "The Making and Breaking of Camouflage" https://arxiv.org/abs/2309.03899 , which proposes three scores to automatically assess the effectiveness of camouflage. Perhaps it can serve as a reference for you.

Thanks again for your reply!

PanchengZhao commented 1 month ago

Thanks for the discussion.

Actually, I have read the paper “The Making and Breaking of Camouflage” carefully, and it has also given me a lot of inspiration. I emailed the authors last year to request the code, which was recently released in CAMEVAL. Thanks to the authors for contributing to this community.

In addition, what I want to emphasize is "recognized" or "generic". Previous research on camouflaged image generation did not establish a complete benchmark, and comparisons were made mainly through visualization. Our work attempts a quantitative comparison, but how well the general-purpose FID and KID metrics apply to camouflaged images remains to be explored.

“The Making and Breaking of Camouflage” creatively proposed three metrics, but two reasons limit their widespread use.

On the one hand, the code is not fully released: the code for the probabilistic scoring functions is still missing, which makes it difficult to use them to build a benchmark for camouflaged image generation.

On the other hand, measuring camouflage is challenging, and many more factors affect the degree of camouflage, such as the size and number of objects. Therefore, there is still a long way to go before a recognized camouflage evaluation metric is established.

PeiChiChen commented 1 month ago

Got it! It helps me a lot. Thank you for your help. Please allow me to reach out to you again if I have further questions~

PanchengZhao commented 1 month ago

If there are no more questions, this issue will be closed. Please feel free to discuss new questions about this work or any insights in this community with me! Other contacts: Email: zhaopancheng@mail.nankai.edu.cn WeChat: zpc972324913

PeiChiChen commented 1 month ago

OK, thanks!

PanchengZhao / LAKE-RED

The generated results look bad #3