jusiro / FLAIR

FLAIR: A Foundation LAnguage-Image model of the Retina for fundus image understanding.
https://jusiro.github.io/projects/flair
Apache License 2.0

the output of the same input image is different #4

Closed veryHope closed 4 months ago

askerlee commented 2 months ago

Yeah, the result is drastically different between runs; it looks almost random. How are we supposed to interpret the results?

[Image: CS58509_R2]

For this image, 3 different runs:

Run 1
Image-Text similarities: [[ 2.438 0.951 -1.828 -1.884 2.602 -1.934 0.96 0.289 -0.282]]
Probabilities: [[0.35 0.079 0.005 0.005 0.413 0.004 0.08 0.041 0.023]]

Run 2
Image-Text similarities: [[ 1.853 -0.314 0.856 -2.382 2.634 0.901 2.705 2.863 2.076]]
Probabilities: [[0.096 0.011 0.035 0.001 0.21 0.037 0.225 0.264 0.12 ]]

Run 3
Image-Text similarities: [[ 4.129 4.457 -2.386 -2.121 2.172 2.341 0.292 -0.185 2.203]]
Probabilities: [[0.347 0.482 0.001 0.001 0.049 0.058 0.007 0.005 0.051]]

The textual labels are: ["normal", "healthy", "macular edema", "diabetic retinopathy", "glaucoma", "macular hole", "lesion", "lesion in the macula", "myopia"]
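
To make the disagreement concrete, here is a small sketch that pairs each run's scores with the labels. It assumes the printed probabilities are a plain softmax over the image-text similarities (an assumption on my part, though it reproduces the printed numbers to rounding); the names `runs`, `softmax`, `probs` and `top_labels` are only illustrative and are not part of FLAIR.

```python
import math

labels = ["normal", "healthy", "macular edema", "diabetic retinopathy",
          "glaucoma", "macular hole", "lesion", "lesion in the macula", "myopia"]

# Image-text similarities copied from the three runs reported above.
runs = [
    [2.438, 0.951, -1.828, -1.884, 2.602, -1.934, 0.96, 0.289, -0.282],
    [1.853, -0.314, 0.856, -2.382, 2.634, 0.901, 2.705, 2.863, 2.076],
    [4.129, 4.457, -2.386, -2.121, 2.172, 2.341, 0.292, -0.185, 2.203],
]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Assumption: probabilities = softmax(similarities), with no extra scaling.
probs = [softmax(r) for r in runs]

# Label with the highest probability in each run.
top_labels = [labels[max(range(len(p)), key=p.__getitem__)] for p in probs]

for i, (p, top) in enumerate(zip(probs, top_labels), start=1):
    print(f"run {i}: top label = {top!r}, p = {max(p):.3f}")
```

Under that assumption the top label is "glaucoma" in the first run, "lesion in the macula" in the second and "healthy" in the third, so the runs disagree on the predicted class and not just on the confidence.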