Closed 1530537501-jst closed 1 year ago
@1530537501-jst Hi, did you solve the problem? I want to generate more captions but can't find the parameters to do so. Looking forward to your reply.
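From the model's anchor predictions, select different top-5 sets. This lets you construct different anchor-graphs, and from those anchor-graphs you can generate different captions.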
@guanghuixu Thank you for your response. What specific modifications do I need to make? If I modify the parameter "evalai_inference," can it generate multiple descriptions?
evalai_inference just indicates the test status.
To generate multiple captions, you need to modify the inference code as I said above. In our paper we still wanted better accuracy than other methods with beam=1; diversity is just an additional benefit of our approach.
This part belongs to our ablation experiments, so we did not release the code for it directly.
Don't use argmax, which only returns the top-1 anchor:
https://github.com/guanghuixu/AnchorCaptioner/blob/main/pythia/modules/gpn.py#L51
@guanghuixu So you mean that removing this line is enough to generate multiple captions? (Thanks to ChatGPT for letting me communicate with you in Chinese.)
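Simply removing it is not enough; the code after it also needs to change. The argmax here means that only the top-1 idx is returned. You can use top-k instead to take the idx of the first several anchors. But once you do that, more anchors are returned, and concatenating the features will cause problems. For example, originally you have one anchor that is concatenated with the visual feature; with k anchors, the corresponding visual feature also needs to be replicated k times.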
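A minimal sketch of the top-k idea, in plain Python on toy lists rather than the repository's tensors. The names topk_indices and build_inputs and the data layout are hypothetical and are not taken from gpn.py; the point is only that each image yields k rows and that the same visual feature is reused for each of them.

```python
from typing import List


def topk_indices(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest scores, best first (argmax is the k=1 case)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def build_inputs(
    anchor_scores: List[List[float]],
    anchor_feats: List[List[List[float]]],
    visual_feats: List[List[float]],
    k: int,
) -> List[List[float]]:
    """Return one concatenated [anchor ; visual] row per (image, anchor) pair."""
    rows = []
    for scores, anchors, visual in zip(anchor_scores, anchor_feats, visual_feats):
        for idx in topk_indices(scores, k):
            # The image's visual feature is repeated for each selected anchor.
            rows.append(anchors[idx] + visual)
    return rows


# Toy data (assumed layout): 2 images, 3 candidate anchors each, 2-dim features.
anchor_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.9]]
anchor_feats = [
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    [[4.0, 4.0], [5.0, 5.0], [6.0, 6.0]],
]
visual_feats = [[9.0, 9.0], [8.0, 8.0]]

rows = build_inputs(anchor_scores, anchor_feats, visual_feats, k=2)
print(len(rows))  # 2 images * 2 anchors = 4 rows
```

In tensor code the same two steps would likely be a top-k call followed by repeating the visual features along the batch dimension, so that the batch grows from B to B*k before the concatenation.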
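There is also a simpler approach: just run inference several times. Take the first anchor on the first run, the second anchor on the second run, the third anchor on the third run, and so on. In post-processing, combine the results of all anchors for each image.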
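The post-processing step of this simpler approach could look roughly like the sketch below. It assumes each run produces a mapping from image id to one caption; run_inference is a hypothetical callable standing in for a full inference pass that uses the anchor at a given rank, and fake_run only fabricates placeholder strings for the demo.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def captions_per_image(
    run_inference: Callable[[int], Dict[str, str]],
    num_anchors: int,
) -> Dict[str, List[str]]:
    """Run inference once per anchor rank and group the captions by image."""
    merged: Dict[str, List[str]] = defaultdict(list)
    for rank in range(num_anchors):
        # Each run is assumed to return {image_id: caption} for that anchor rank.
        for image_id, caption in run_inference(rank).items():
            merged[image_id].append(caption)
    return dict(merged)


def fake_run(rank: int) -> Dict[str, str]:
    """Stand-in for a real inference pass (placeholder captions only)."""
    return {
        "img_1": f"img_1 caption from anchor {rank}",
        "img_2": f"img_2 caption from anchor {rank}",
    }


results = captions_per_image(fake_run, num_anchors=3)
print(results["img_1"])
```

This costs one full inference pass per anchor, but it leaves the model code untouched apart from choosing which anchor rank to return.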
OK, I will try your method. Thank you again for your reply. I hope to see more of your work in the future.
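Thank you for sharing your code. When I run it, only one caption is generated per image. Which configuration file do I need to modify to get multiple captions?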