Dear authors of InstructBLIP, while reading the InstructBLIP code, I noticed that the text processor converts most of the input to lowercase, and that the model's outputs are all lowercase as well. Most punctuation is also omitted. Could you explain the rationale behind this preprocessing? Thanks!
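
To make the question concrete, here is a minimal sketch of the kind of normalization I am describing. It is my own approximation and not the actual InstructBLIP code: the function name `normalize_text` and the exact set of stripped punctuation marks are assumptions chosen for illustration.

```python
import re


def normalize_text(text: str) -> str:
    """Hypothetical helper that mimics the behavior described above.

    Not the InstructBLIP implementation; the punctuation set is an assumption.
    """
    # Lowercase the whole string.
    text = text.lower()
    # Replace a handful of punctuation marks with spaces (others are kept).
    text = re.sub(r"[.!\"()*#:;~]", " ", text)
    # Collapse the runs of whitespace left behind by the substitution.
    text = re.sub(r"\s{2,}", " ", text)
    return text.strip()


print(normalize_text("What is shown in the image? A dog (brown), running!"))
```

With a rule like this, casing and most punctuation are lost before the text reaches the model, which is the behavior I would like to understand the motivation for.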