Closed: mneedham closed this 10 months ago
It might be interesting to do a video showing what kinds of things the open source multimodal models can't do very well. We could use GPT-4V as the ground truth, as it seems to be able to handle pretty much any image.
I kinda did this in my latest video.