guanghuixu / AnchorCaptioner

Generating multiple captions #14

Closed 1530537501-jst closed 1 year ago

1530537501-jst commented 1 year ago

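Thank you for sharing your code. When I ran it, I found that only one caption is generated per image. Which configuration file do I need to modify to get multiple captions?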

PAradoxLG commented 1 year ago

@1530537501-jst Hi, did you solve the problem? I want to generate more captions but can't find the parameters to do it. Looking forward to your reply.

guanghuixu commented 1 year ago

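From the model's anchor prediction results, select different top-5 sets. That way you can construct different anchor-graphs, and based on these anchor-graphs you can generate different captions.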

PAradoxLG commented 1 year ago

@guanghuixu Thank you for your response. What specific modifications do I need to make? If I modify the parameter "evalai_inference", can it generate multiple descriptions?

guanghuixu commented 1 year ago

evalai_inference just indicates the test status. To generate multiple captions, you need to modify the inference code as I said above. In our paper we still wanted better accuracy than other methods with beam=1; diversity is just an additional benefit of our implementation method. That part is our ablation experiment, so we didn't release the code for it directly.

guanghuixu commented 1 year ago

Don't use argmax, which only returns the top-1 anchor: https://github.com/guanghuixu/AnchorCaptioner/blob/main/pythia/modules/gpn.py#L51

PAradoxLG commented 1 year ago

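@guanghuixu Do you mean that simply removing this line is enough to generate multiple captions? (Thanks to ChatGPT for letting me communicate with you in Chinese.)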

guanghuixu commented 1 year ago

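Just removing it won't work; the code after it also needs to change. The argmax here returns only the top-1 idx. You can use top-k to take out the first several idx instead. But once you do that, more anchors are returned, and there will be a problem when concatenating features. For example, originally you had one anchor concatenated with the visual feature; with k anchors, the corresponding visual feature also needs to be copied k times.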
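A minimal, framework-free sketch of the top-k idea above, for illustration only. It is not the repository's code: the function names `top_k_indices` and `expand_for_top_k` and the list-based data layout are assumptions made for this example. In PyTorch the analogous operations would presumably be `torch.topk` for the selection and `torch.repeat_interleave` for copying the visual features, but how they fit into gpn.py has not been checked here.

```python
from typing import List, Sequence, Tuple


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest scores, best first.

    With k=1 this reduces to argmax.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def expand_for_top_k(
    anchor_scores: Sequence[Sequence[float]],
    visual_features: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, int, List[float]]]:
    """Build one (image_index, anchor_index, visual_feature) row per anchor.

    Each image contributes k rows, and every row carries its own copy of
    that image's visual feature, so a later per-row concatenation of
    anchor and visual feature sees matching lengths.
    """
    rows = []
    for image_index, (scores, feature) in enumerate(
        zip(anchor_scores, visual_features)
    ):
        for anchor_index in top_k_indices(scores, k):
            rows.append((image_index, anchor_index, list(feature)))
    return rows


# Hypothetical toy data: 2 images, 3 candidate anchors each.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.9]]
features = [[1.0, 2.0], [3.0, 4.0]]
rows = expand_for_top_k(scores, features, k=2)
```

With `k=2` the two images become four rows, and the row count (batch size times k) is what the downstream code has to be adapted to.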

guanghuixu commented 1 year ago

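There is also a simpler approach: just run inference multiple times. On the first run take the first anchor, on the second run take the second anchor, on the third run take the third anchor, and so on. Then, in post-processing, combine the results from all anchors for each image.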
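The post-processing step could look roughly like the sketch below. It assumes each run yields a list of dicts with `image_id` and `caption` keys; those field names, and the function name `merge_runs`, are hypothetical and may not match the repository's actual prediction format. Dropping repeated captions is an extra choice made here, not something stated in the thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def merge_runs(runs: Iterable[List[dict]]) -> Dict[str, List[str]]:
    """Group captions from several inference runs by image.

    `runs` holds one prediction list per run (run 1 used the first
    anchor, run 2 the second, and so on). Captions keep run order.
    """
    merged: Dict[str, List[str]] = defaultdict(list)
    for predictions in runs:
        for pred in predictions:
            captions = merged[pred["image_id"]]
            # Assumption: identical captions from different anchors
            # are kept only once.
            if pred["caption"] not in captions:
                captions.append(pred["caption"])
    return dict(merged)


# Hypothetical outputs of two runs over the same two images.
run_1 = [
    {"image_id": "a", "caption": "a sign"},
    {"image_id": "b", "caption": "a bus"},
]
run_2 = [
    {"image_id": "a", "caption": "a red sign"},
    {"image_id": "b", "caption": "a bus"},
]
merged = merge_runs([run_1, run_2])
```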

PAradoxLG commented 1 year ago

OK, I will try your method. Thanks again for your reply. I hope to see more of your work in the future.