How to predict punctuation in the image captioning?

salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

BSD 3-Clause "New" or "Revised" License

How to predict punctuation in the image captioning? #516

Open HiLittleFriend opened 11 months ago

HiLittleFriend commented 11 months ago

Hi, thank you for your wonderful work. I am trying to predict paragraphs (several sentences) using a model fine-tuned on my custom dataset. The result looks like "a white house with blue trim is sitting on the street the house has a front porch with stairs leading up to it the house has three windows on the front and two on the side". How can I get the model to predict punctuation when generating text? Is there an option for this?

shams2023 commented 9 months ago

Hello! Was the example text you posted generated by the model, or is it a description that comes with the dataset?

pxiangwu commented 8 months ago

The missing punctuation could be due to this piece of code, which preprocesses the captions:

https://github.com/salesforce/LAVIS/blob/main/lavis/processors/blip_processors.py#L49-L54
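
To illustrate the idea, here is a minimal sketch rather than the library's actual code. It assumes the linked lines apply a regular-expression substitution that replaces punctuation characters with spaces before the caption reaches the tokenizer. If so, the model never sees punctuation during fine-tuning and cannot learn to generate it. The pattern and both function names below are hypothetical, so check them against the linked file before relying on them.

```python
import re

# Assumed set of characters removed during caption preprocessing.
# This pattern is an illustration, not copied from the linked lines.
STRIP_PATTERN = r"([.!\"()*#:;~])"


def strip_punctuation_caption(caption: str) -> str:
    """Hypothetical preprocessing that removes punctuation from a caption."""
    caption = re.sub(STRIP_PATTERN, " ", caption.lower())
    caption = re.sub(r"\s{2,}", " ", caption)  # collapse repeated whitespace
    return caption.strip()


def keep_punctuation_caption(caption: str) -> str:
    """Hypothetical variant that only lowercases and normalizes whitespace."""
    caption = re.sub(r"\s{2,}", " ", caption.lower())
    return caption.strip()


sample = "A white house with blue trim is sitting on the street. The house has a front porch."
print(strip_punctuation_caption(sample))
print(keep_punctuation_caption(sample))
```

If the preprocessing does behave this way, training on captions processed like the second function (for example via a custom caption processor) should let the fine-tuned model see sentence boundaries. The model would need to be fine-tuned again on the punctuated captions for this to affect its output.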