Generating multiple captions - Githubissues

guanghuixu / AnchorCaptioner

Generating multiple captions #14

Closed 1530537501-jst closed 1 year ago

1530537501-jst commented 1 year ago

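Thank you for sharing your code. When I run it, only one caption is generated per image. Which configuration file do I need to modify to get multiple captions?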

PAradoxLG commented 1 year ago

@1530537501-jst Hi, did you solve the problem? I want to generate more captions but can't find a parameter that does it. Looking forward to your reply.

guanghuixu commented 1 year ago

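From the model's anchor predictions, pick different anchors out of the top-5. Each choice constructs a different anchor-graph, and each anchor-graph can then be used to generate a different caption.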

PAradoxLG commented 1 year ago

@guanghuixu Thank you for your response. What specific modifications do I need to make? If I modify the parameter "evalai_inference," can it generate multiple descriptions?

guanghuixu commented 1 year ago

evalai_inference only indicates the test status. To generate multiple captions, you need to modify the inference code as I described above. In the paper our main goal is still better accuracy than other methods at beam=1; diversity is just an additional benefit of our approach. That part belongs to our ablation experiments, so we did not include the code for it directly.

guanghuixu commented 1 year ago

Don't use argmax, which only returns the top-1 anchor: https://github.com/guanghuixu/AnchorCaptioner/blob/main/pythia/modules/gpn.py#L51

PAradoxLG commented 1 year ago

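@guanghuixu Do you mean that simply removing this line is enough to generate multiple captions? (Thanks to ChatGPT for letting me communicate with you in Chinese.)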

guanghuixu commented 1 year ago

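Simply removing it is not enough; the code that follows also needs to change. Here argmax returns only the top-1 index. You can use top-k instead to take out the indices of the first few anchors. Once you do that, though, more anchors are returned, and concatenating the features will cause problems. For example, originally there is one anchor that gets concatenated with the visual feature; with k anchors, the corresponding visual feature also has to be copied k times.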
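
A minimal sketch of the top-k idea described above, written in plain Python rather than the repository's PyTorch code. The function names, the toy scores, and the list-based features are all hypothetical and only illustrate the shape problem: once k anchors are selected, the single visual feature has to be paired with each of them. In PyTorch the index selection would typically be done with `torch.topk`, and the real change in `gpn.py` and the downstream modules may look quite different.

```python
from typing import List, Sequence, Tuple


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def pair_anchors_with_visual(
    anchor_scores: Sequence[float],
    token_features: Sequence[Sequence[float]],
    visual_feature: Sequence[float],
    k: int,
) -> List[Tuple[int, List[float]]]:
    """Build one concatenated input per selected anchor.

    The single visual feature is reused for every selected anchor, which
    is the list-based equivalent of copying it k times before concat.
    """
    inputs = []
    for idx in top_k_indices(anchor_scores, k):
        fused = list(token_features[idx]) + list(visual_feature)
        inputs.append((idx, fused))
    return inputs


# Hypothetical anchor scores for four candidate tokens in one image.
scores = [0.1, 0.7, 0.05, 0.15]
token_feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
visual = [9.0, 9.0, 9.0]

for anchor_idx, fused in pair_anchors_with_visual(scores, token_feats, visual, k=2):
    print(anchor_idx, fused)
```

With k=1 this reduces to the original argmax behaviour, so the same code path can serve both the single-caption and the multi-caption case.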

guanghuixu commented 1 year ago

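There is also a simpler approach: just run inference several times. On the first run take the first anchor, on the second run the second anchor, on the third run the third anchor, and so on. Then, in post-processing, merge the results from all anchors for each image.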
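
The post-processing step of this multi-run approach could be sketched as follows. This is only an illustration under assumptions: the record format (`image_id` and `caption` keys) and the function name `merge_runs` are hypothetical, not the repository's actual prediction format, and dropping duplicate captions is an optional choice that the thread does not mention.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def merge_runs(runs: Iterable[List[dict]]) -> Dict[str, List[str]]:
    """Group captions from several inference runs by image id.

    `runs` holds one list of records per run (run i used the i-th ranked
    anchor). Each record is assumed to look like
    {"image_id": "...", "caption": "..."}.
    """
    merged: Dict[str, List[str]] = defaultdict(list)
    for run in runs:
        for record in run:
            captions = merged[record["image_id"]]
            # Different anchors can still lead to the same sentence;
            # keep each distinct caption once, in run order.
            if record["caption"] not in captions:
                captions.append(record["caption"])
    return dict(merged)


# Toy results from three runs (anchor rank 1, 2, 3) over two images.
run_1 = [{"image_id": "a", "caption": "a red stop sign"},
         {"image_id": "b", "caption": "a bottle of cola"}]
run_2 = [{"image_id": "a", "caption": "a sign on main street"},
         {"image_id": "b", "caption": "a bottle of cola"}]
run_3 = [{"image_id": "a", "caption": "a street with a sign"},
         {"image_id": "b", "caption": "a drink on a table"}]

all_captions = merge_runs([run_1, run_2, run_3])
print(all_captions["a"])
print(all_captions["b"])
```

This costs k full inference passes, but it needs no change to the feature concatenation code, which is why it is the simpler of the two options.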

PAradoxLG commented 1 year ago

OK, I will try your method. Thank you again for your reply; I hope to see more of your work in the future.