Screenshot to structured data

@kaihendry Thanks for submitting this issue!

By default, the gpt4-v-vision tool uses OpenAI's gpt-4-turbo model (previously known as gpt-4-vision-preview) to interpret images. Skimming through the OpenAI docs, I didn't see anything mentioning OCR-related limitations specifically, but I did find a community thread where folks were encountering similar issues. In that thread it looks like it has become increasingly difficult to get decent OCR results via OpenAI's API and model. At the moment, it's unclear to me what OpenAI's official level of support is for OCR

We always have the option of writing another vision tool for a non-OpenAI model if we can find one with better OCR support too.

In the meantime, when I get the chance I'll try to repro your issue.

gptscript-ai / gpt4-v-vision

Screenshot to structured data #11