Open mhasnat opened 1 year ago
Hi, I would also be very interested in how to run inference on a single image. I am also having quite a few difficulties extracting just the part of the code that does that.
Thanks!
Hi, thank you for your interest in our work. We are planning to release a Jupyter notebook to do inference on a single image. As for your question, the logit is the output of the network, which indicates whether it considers the image synthetic or real. To determine which one the network is suggesting, you need to fix a decision threshold. You can fix one threshold for all architectures or, as highlighted in the paper, you can assume you have some images from each architecture and determine the most appropriate threshold for each one using a calibration technique. If you prefer, you can also apply a sigmoid function to the logit to convert it into a probability and then apply a threshold to it, for example a threshold of 0.5.
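To make the reply above concrete, here is a minimal Python sketch of both options: a fixed threshold on the sigmoid probability, and a per-architecture threshold picked from a few labelled images. All function names are illustrative and not taken from the repository. The balanced-accuracy search is only an assumed stand-in for a calibration technique, not necessarily the one used in the paper. Note that a probability threshold of 0.5 is equivalent to a logit threshold of 0.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Numerically stable sigmoid: maps a logit to a value in (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def is_synthetic(logit: float, prob_threshold: float = 0.5) -> bool:
    """Fixed-threshold decision: True means 'synthetic', False means 'real'."""
    return logit_to_probability(logit) > prob_threshold


def calibrate_threshold(logits, is_fake):
    """Pick a logit threshold from a small labelled set (assumed procedure).

    `is_fake[i]` is True when image i is synthetic. Candidate thresholds are
    the midpoints between consecutive distinct logits, plus one value below
    the minimum and one above the maximum. The candidate with the highest
    balanced accuracy wins (first one found in case of ties).
    """
    n_fake = sum(1 for y in is_fake if y)
    n_real = len(is_fake) - n_fake
    if n_fake == 0 or n_real == 0:
        raise ValueError("calibration needs both real and synthetic examples")

    values = sorted(set(logits))
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best_threshold, best_score = candidates[0], -1.0
    for t in candidates:
        true_pos = sum(1 for x, y in zip(logits, is_fake) if y and x > t)
        true_neg = sum(1 for x, y in zip(logits, is_fake) if not y and x <= t)
        score = 0.5 * (true_pos / n_fake + true_neg / n_real)
        if score > best_score:
            best_threshold, best_score = t, score
    return best_threshold


if __name__ == "__main__":
    # Made-up logits for illustration only; they are not real model outputs.
    print(is_synthetic(1.3), is_synthetic(-0.7))
    print(calibrate_threshold([-3.0, -2.0, -1.5, -1.0], [False, False, True, True]))
```

With a calibrated threshold `t`, the decision becomes `logit > t` instead of `logit > 0`, which matters when a generator's logits sit mostly on one side of zero.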
Thanks for the great work! In the CSV file, can you verify that the label "True" indicates a fake, while "False" signifies a real image? Therefore, if we apply a sigmoid function to the logit to convert it into a probability, does this imply that the resulting probability represents the likelihood of a synthetic image? Thanks!
Yes, all you have said is correct.
@RCorvi Thanks for releasing the code. I have a question regarding the output logits. The README mentions that if the logit is positive then the image is fake. But I observed that for dalle_2 most of them are negative. Did you use a different threshold for images from different generation models?
Hi, sorry for the late reply. As can be seen in the released code, we do not change the threshold based on the architecture. As a matter of fact, the paper reports that the accuracy on Dalle 2 is 50.0.
@mhasnat @jplu Hello, did you manage to perform inference on a single image? If yes, would you mind sharing the code? It seems not to be as straightforward as I thought. Thank you.
Hey @v-v, unfortunately not. Was waiting for the authors to release the single image inference code.
Hi, I apologize that we have not yet gotten around to writing this code. What exactly is the issue that made it difficult to write? In the meantime, if you need to do inference on a single image, you can also use as input a CSV with just one path inside; the code will produce the CSV with the corresponding logit. Regardless, I will try to write it as soon as I find time.
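The one-row CSV workaround described above could be prepared along these lines. This is only a sketch: the helper names and the column names (`filename`, `logit`) are assumptions made for illustration, so check the header that the repository's script actually expects and produces before using it. Running the repository's script on the CSV is left out on purpose.

```python
import csv
from pathlib import Path


def write_single_image_csv(image_path, csv_path, column="filename"):
    """Write a CSV with a header row and exactly one image path.

    `column` is a hypothetical header name; replace it with the one the
    repository's script expects.
    """
    csv_path = Path(csv_path)
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([column])
        writer.writerow([str(image_path)])
    return csv_path


def read_single_logit(csv_path, column="logit"):
    """Read the logit back from a one-row output CSV.

    `column` is again a hypothetical name for the column holding the logit.
    """
    with Path(csv_path).open(newline="") as f:
        rows = list(csv.DictReader(f))
    if len(rows) != 1:
        raise ValueError(f"expected exactly one row, found {len(rows)}")
    return float(rows[0][column])
```

A positive logit read back this way would then be interpreted as "synthetic", following the README convention quoted earlier in the thread.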
Hello!
@v-v Use my pull request:
https://github.com/grip-unina/DMimageDetection/pull/12
It contains a file called single.py for single-image inference, plus several updates that make installation easier with more up-to-date dependencies.
Hi, while we do appreciate the work done in the pull request, this repository contains the code of a published paper for the sake of reproducibility of the results. As such, @v-v, if you are interested in testing our code for research purposes and want to compare your results to our methodology, please use the code from the main repository and not the code from the pull request. Thank you.
Hi,
thank you for the great work and for uploading the code. I was wondering if you could provide a notebook to perform inference on a single image.
In fact, I tried to do the inference in a notebook, and for a single image I observed that:
However, at this point I am confused about how to get the detection result from the logit value. Please correct me if I am wrong.
It would be really helpful if, as an author, you could provide the notebook for inference. If that is not possible, then please answer or clarify the above questions so that I can prepare the notebook and share it.
Thank you.