uakarsh / latr

Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answering (STVQA)
https://uakarsh.github.io/latr/
MIT License
52 stars 7 forks

huggingface demo #16

Open mxw20010804 opened 2 months ago

mxw20010804 commented 2 months ago

Hello, there is a runtime error in the Hugging Face demo. Can you fix it? Thank you! https://huggingface.co/spaces/iakarshu/latr-vqa