I am puzzled, because the quickstart (judging from its code) sends up to 10 images, or up to 50 with prompt caching. How do they fit? Am I doing something completely wrong?
@lostmsu your content block is implicitly being processed as text rather than an image. See here for an example of how to properly structure your image tool result.
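For anyone landing here later, this is a minimal sketch of the shape I mean, assuming the Anthropic Messages API format for tool results. The helper name `image_tool_result` and the `toolu_example` ID are made up for illustration. The point is that the screenshot goes in as an `image` block inside the tool result's `content` list, not as a base64 string in a `text` block:

```python
import base64


def image_tool_result(tool_use_id: str, png_bytes: bytes) -> dict:
    """Build a tool_result block that carries a PNG as an image block.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": [
            {
                # "image", not "text": a base64 string placed in a text
                # block is tokenized as ordinary text.
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": base64.b64encode(png_bytes).decode("ascii"),
                },
            }
        ],
    }


# Placeholder bytes stand in for a real screenshot here.
block = image_tool_result("toolu_example", b"\x89PNG\r\n\x1a\n")
print(block["content"][0]["type"])
```

If the base64 data ends up in a `text` block instead, it is counted as plain text tokens, which would explain a single screenshot not fitting.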
I actually have a custom implementation of the tool. But here's the JSON I submit:
computer-use.json
It is a simple 1024x768 PNG.