microsoft / LayoutGeneration

MIT License

Just curious about the evaluation of content-aware generation (PosterLayout Dataset) in repo 'LayoutPrompter' #39

Closed: yangtao2019yt closed this issue 9 months ago

yangtao2019yt commented 10 months ago

In the Jupyter notebook you provided (https://github.com/microsoft/LayoutGeneration/blob/main/LayoutPrompter/notebooks/content_aware.ipynb), you set `raw_path = os.path.join(RAW_DATA_PATH(dataset), split, "saliencymaps_pfpn")`, which means you only consider the PFPN-processed saliency map when converting the saliency map into a bounding box.
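For concreteness, here is a minimal sketch of what extracting a single bounding box from one (PFPN-only) saliency map could look like; the `saliency_bbox` helper and its threshold are hypothetical and not the notebook's actual code:

```python
import numpy as np

def saliency_bbox(sal, thresh=0.5):
    # sal: 2-D float array in [0, 1] (a loaded grayscale saliency map).
    # thresh is an assumed cutoff; the notebook may use a different rule.
    ys, xs = np.nonzero(sal > thresh)
    if xs.size == 0:
        return None  # no pixel exceeds the threshold
    # (left, top, right, bottom) box covering all salient pixels
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

A box extracted this way from the PFPN map alone can miss pixels that only the BASNet map marks as salient, which is exactly the gap I am asking about below.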

But in the evaluation code provided by PosterLayout (https://github.com/PKU-ICST-MIPL/PosterLayout-CVPR2023/blob/main/eval.py), it looks like this:

```python
pic_1 = np.array(Image.open(os.path.join("Dataset/test/saliencymaps_pfpn", name.replace(".", "_pred."))).convert("L").resize((513, 750))) / 255
pic_2 = np.array(Image.open(os.path.join("Dataset/test/saliencymaps_basnet", name)).convert("L").resize((513, 750))) / 255
pic = np.maximum(pic_1, pic_2)
```

That is, they consider both the PFPN and the BASNet saliency maps for evaluation, combining them with a pixel-wise maximum.

In the results you provided, you achieved an impressive trade-off between the Utility score and the Occlusion score. My question is: since there are differences between these two groups of masks, is it possible for your model to avoid overlapping the BASNet salient regions when it is only given bounding boxes extracted from the PFPN maps?