minostauros opened this issue 6 months ago
I have migrated CogVLM-chat-hf to MindSpore and found that the model works well when the input includes both an image and text. However, with a text-only query (no image), the performance is not as good; I guess it may be related to this issue.
@antigone660 In text-only mode, the prompt template is different. Did you use the following prompt for text-only queries?
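For reference, the text-only template in the repo's demo code is Vicuna-style and looks roughly like this (a sketch; the exact wording may differ between CogVLM versions):

```python
# Vicuna-style template used when no image is provided.
# The wording below follows the CogVLM/CogVLM2 demo code and may not match
# the exact string in every release.
text_only_template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: {} ASSISTANT:"
)

query = text_only_template.format("Describe the history of the Great Wall.")
```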
In my case, text-only mode works well regardless of this issue
@minostauros Thanks for your reply. I did not use the template before, and it works now :)
So, has the LM also been tuned? I also found that the LM weights of CogVLM2 are different from those of llama-3-8B-Instruct.
While CogVLM is trained, the LM weights are frozen.
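To be clear about what "frozen" means here: the original LM parameters simply receive no gradient updates while the newly added vision/expert modules are trained. A generic sketch of that setup (the module-name keywords are hypothetical, not CogVLM's actual parameter names):

```python
import torch.nn as nn

def freeze_language_backbone(model: nn.Module) -> None:
    # Keep gradients only for the newly added modules; everything belonging to
    # the original language model stays frozen (requires_grad=False), so an
    # optimizer built from model.parameters() never updates it.
    trainable_keywords = ("vision", "visual_expert")  # hypothetical names
    for name, param in model.named_parameters():
        param.requires_grad = any(k in name for k in trainable_keywords)
```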
From my observation, however, the LM weights of CogVLM are different from Vicuna's.
Vicuna: https://huggingface.co/lmsys/vicuna-7b-v1.5/tree/main
CogVLM: cogvlm-chat-v1.1 (both from HF or SAT)
Can I ask why, or what the proper source of the language model is?
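For reference, here is a minimal sketch of one way such a comparison could be done, assuming both checkpoints load through transformers and expose get_input_embeddings() as usual (the repo IDs and the embedding-only check are illustrative assumptions):

```python
import torch
from transformers import AutoModelForCausalLM

# Load both checkpoints on CPU in fp16 (needs roughly 28 GB of RAM).
vicuna = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-7b-v1.5", torch_dtype=torch.float16
)
cogvlm = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-chat-hf", torch_dtype=torch.float16, trust_remote_code=True
)

# Compare one component both models share regardless of internal naming:
# the token-embedding table. Attention/MLP weights would need an explicit
# name mapping because CogVLM stores them inside its expert modules.
v_emb = vicuna.get_input_embeddings().weight
c_emb = cogvlm.get_input_embeddings().weight

n = min(v_emb.shape[0], c_emb.shape[0])  # vocab sizes may differ slightly
max_diff = (v_emb[:n] - c_emb[:n]).float().abs().max().item()
print(f"max abs diff over shared embedding rows: {max_diff:.6f}")
```

A non-zero difference here would indicate that the LM side of the checkpoint was indeed changed relative to Vicuna.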