[Closed] SuperCrystal closed this issue 3 years ago
@SuperCrystal Hi, sorry for the late response. Empirically, testing images at their original size usually delivers better performance than cropping a 384 x 384 patch in our experiments.
@zwx8981 Thanks a lot for your response! Another question: if std_modeling is False, then p = y_diff = y1 - y2 in the code you provided. When I use it this way, the loss simply does not converge. However, if a sigmoid function is applied, it works correctly. I wonder whether you have also tested this. It is a really impressive work anyway :)
Yes, when you set std_modeling to false, you should change the loss function to BCEWithLogitsLoss, which inherently contains a sigmoid function. You may also still use the fidelity loss, in which case you should manually append a sigmoid function as you said, or fix the std to a constant, say 1, and convert the difference into a probability using the same equation (Thurstone model). Essentially, the fidelity loss measures the difference between two probability distributions, so you should first convert the raw difference logit into a probability.
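The two options above can be sketched in plain Python (stdlib only, for clarity; the variable names and the fixed std value are illustrative, not taken from the UNIQUE codebase). Both routes first map the raw score difference y1 - y2 into a probability, either with a sigmoid or with the Thurstone model using a constant std, before applying the fidelity loss:

```python
import math

def sigmoid(x):
    # Option 1: squash the raw difference logit into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def thurstone_prob(y_diff, std=1.0):
    # Option 2: Thurstone model with the std fixed to a constant.
    # P(image 1 preferred) = Phi(y_diff / (std * sqrt(2))),
    # where Phi is the standard normal CDF.
    return 0.5 * (1.0 + math.erf(y_diff / (std * math.sqrt(2.0))))

def fidelity_loss(p, g):
    # Fidelity loss between the predicted probability p and the
    # ground-truth label g, both in [0, 1]; zero iff p == g.
    return 1.0 - math.sqrt(p * g) - math.sqrt((1.0 - p) * (1.0 - g))

# Predicted quality scores for a pair of images (illustrative values).
y1, y2 = 0.8, 0.3
g = 1.0  # ground truth: image 1 is preferred

# Without a conversion, p = y1 - y2 is not a probability and the
# fidelity loss is ill-defined outside [0, 1].
p_sig = sigmoid(y1 - y2)
p_thu = thurstone_prob(y1 - y2, std=1.0)
loss_sig = fidelity_loss(p_sig, g)
loss_thu = fidelity_loss(p_thu, g)
```

Note that both conversions are monotone in y1 - y2, so they preserve the predicted preference ordering; they only differ in how sharply the probability saturates.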
In the paper, it is said that during training the input size is set to 384 x 384 for images from all the databases, while at test time the network runs inference at the original size. What if the test size is also 384 x 384? Will this affect the performance?