kingsdigitallab / kdl-vqa

Python tool for batch visual question answering (BVQA).
MIT License

API to run a task over a series of (response, image) pairs #14

Open geoffroy-noel-ddh opened 1 week ago

geoffroy-noel-ddh commented 1 week ago

e.g. rotate an image based on the model's answer to a question about its orientation.

Other tasks could crop a region, split double pages, etc.
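A rough sketch of what such an API could look like, to make the idea concrete. Everything here is hypothetical and not part of kdl-vqa: the names `run_task`, `fix_orientation`, `rotate_grid_clockwise`, the answer vocabulary in `ROTATION_FOR_ANSWER`, and the use of a nested list as a stand-in for an image. A real task would more likely load the file and rotate it with an imaging library.

```python
# Hypothetical mapping from the model's orientation answer to the clockwise
# rotation (in degrees) needed to bring the image upright. The answer strings
# are an assumption; they depend on how the question is phrased.
ROTATION_FOR_ANSWER = {
    "upright": 0,
    "rotated left": 90,
    "upside down": 180,
    "rotated right": 270,
}


def run_task(pairs, task):
    """Apply task(response, image) to each (response, image) pair.

    Results are yielded lazily so a large batch is never held in memory.
    """
    for response, image in pairs:
        yield task(response, image)


def rotate_grid_clockwise(grid, degrees):
    """Rotate a 2D list of pixels clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("degrees must be a multiple of 90")
    for _ in range((degrees // 90) % 4):
        # Reversing the rows and transposing is one clockwise quarter turn.
        grid = [list(row) for row in zip(*grid[::-1])]
    return grid


def fix_orientation(response, image):
    """Example task: rotate the image according to the model's answer."""
    degrees = ROTATION_FOR_ANSWER.get(response.strip().lower())
    if degrees is None:
        # Unrecognised answer: leave the image unchanged.
        return image
    return rotate_grid_clockwise(image, degrees)


# Usage: tiny nested lists stand in for real images.
pairs = [
    ("upside down", [[1, 2], [3, 4]]),
    ("upright", [[5, 6]]),
    ("unsure", [[7]]),
]
results = list(run_task(pairs, fix_orientation))
```

Keeping the task as a plain callable means cropping a region or splitting double pages could be added as further functions with the same `(response, image)` signature, without changing `run_task`.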