prasunroy / stefann

:fire: [CVPR 2020] STEFANN: Scene Text Editor using Font Adaptive Neural Network (official code).
https://prasunroy.github.io/stefann
Apache License 2.0

SSIM calculation in fannet #18

Open thuliu-yt16 opened 3 years ago

thuliu-yt16 commented 3 years ago

I appreciate your excellent work on text editing. I tried to run FANnet with the pretrained model on your datasets, so I downloaded the pretrained weights from here and the datasets from here, following the README.

To generate results using the validation set as input, I modified fannet.py and ran the following code:

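import cv2
from skimage.metrics import structural_similarity as ssim

# valid_datagen and fannet are the objects already defined in fannet.py
for data in valid_datagen.flow():
    [x, onehot], y = data
    out = fannet.predict([x, onehot])
    n = x.shape[0]
    for i in range(n):
        # View each sample as a 64x64 image
        _x = x[i].reshape(64, 64)
        _gt = y[i].reshape(64, 64)
        _out = out[i].reshape(64, 64)
        # Binarize the prediction with Otsu's threshold
        _, _out_bin = cv2.threshold(_out, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # SSIM between the ground truth and the binarized prediction
        sv = ssim(_gt, _out_bin, data_range=255, gaussian_weights=True, sigma=1.5, use_sample_covariance=False)
        print(sv)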

But the SSIM value sv is far from the one reported in the paper. I also tried averaging the SSIM per source character, and there is still a large gap. So I am wondering whether I made a mistake when running the model, or whether the problem is in the SSIM calculation itself.
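
One thing that might be worth ruling out first is a scale mismatch between _gt and _out_bin: the thresholded output is 0/255, and data_range=255 assumes both images are on that scale. The sketch below is only a guess at such a check, not the authors' evaluation code. It assumes the arrays coming out of valid_datagen and fannet.predict are floats in [0, 1], which is not stated above, and to_uint8 is a hypothetical helper name.

```python
import numpy as np


def to_uint8(img):
    """Return img as uint8 in [0, 255].

    Assumption (not confirmed by the repository): float inputs lie in [0, 1].
    Arrays that are already uint8 are returned unchanged.
    """
    img = np.asarray(img)
    if img.dtype == np.uint8:
        return img
    lo, hi = float(img.min()), float(img.max())
    if lo < 0.0 or hi > 1.0:
        # Fail loudly instead of silently rescaling an unexpected range.
        raise ValueError(f"expected values in [0, 1], got [{lo}, {hi}]")
    return np.round(img * 255.0).astype(np.uint8)


if __name__ == "__main__":
    # Tiny synthetic example standing in for one 64x64 sample.
    sample = np.array([[0.0, 0.2], [1.0, 0.25]], dtype=np.float32)
    converted = to_uint8(sample)
    print(sample.dtype, sample.min(), sample.max())
    print(converted.dtype, converted.min(), converted.max())
```

If the assumption holds, both _gt and _out could be passed through to_uint8 before the cv2.threshold and ssim calls, so that the two images share the same dtype and value range. OpenCV documents Otsu thresholding for 8-bit single-channel images, so the conversion may matter for that step as well.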