It seems that GPT4V gets stuck in a text-only mode when the first user message has no images attached: it then refuses to recognize images in follow-up messages.
This is GPT4V behavior that we cannot change.
However, for the DIAL Chat user it may be quite confusing.
To highlight this behavior, we may add a warning stage saying something like this:
Warning:
Keep in mind that the GPT4 Vision model expects image(s) in the first user message.
Otherwise, it is going to work as a text-only model.
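
A minimal sketch of how such a check could look, under stated assumptions: the messages are taken to follow the OpenAI-style chat format, where an image is a content part with `"type": "image_url"`. The helper names `has_image` and `first_message_warning` are hypothetical and not part of the existing code; the actual DIAL attachment format may differ. Here the warning is returned whenever the first user message carries no image.

```python
from typing import Any, Optional

# Warning text proposed in this description.
WARNING = (
    "Keep in mind that the GPT4 Vision model expects image(s) "
    "in the first user message. "
    "Otherwise, it is going to work as a text-only model."
)


def has_image(message: dict[str, Any]) -> bool:
    """Return True if the message has at least one image content part.

    Assumption: content is either a plain string (text only) or a list
    of parts, where image parts have type == "image_url".
    """
    content = message.get("content")
    if not isinstance(content, list):
        return False
    return any(
        isinstance(part, dict) and part.get("type") == "image_url"
        for part in content
    )


def first_message_warning(messages: list[dict[str, Any]]) -> Optional[str]:
    """Return the warning if the first user message contains no image."""
    first_user = next((m for m in messages if m.get("role") == "user"), None)
    if first_user is None or has_image(first_user):
        return None
    return WARNING
```

The caller would show the returned text as a warning stage before the model response, and skip it when the result is `None`.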