-
The LLaVA-v1.5 results on the TextVQA benchmark reported in this paper are much lower than those reported in the LLaVA-v1.5 paper itself.
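
For reference, a minimal sketch of the soft VQA accuracy that TextVQA uses, so the two numbers can be compared under the same metric. The official evaluator also normalizes answers and averages over answer subsets, which this skips; `pred` and `gt_answers` are hypothetical names.

```python
# Simplified soft VQA accuracy: a prediction scores min(#matching human
# answers / 3, 1). The official TextVQA evaluator additionally normalizes
# answers and averages the score over subsets of the 10 annotations.
def vqa_accuracy(pred: str, gt_answers: list[str]) -> float:
    pred = pred.strip().lower()
    matches = sum(a.strip().lower() == pred for a in gt_answers)
    return min(matches / 3.0, 1.0)

print(vqa_accuracy("stop", ["stop", "stop sign", "stop", "stop"]))  # 1.0
```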
-
Appendix A's Image-text Data Collection mentions: "_It is important to note that the
OCR detector is utilized solely for generating enriched data and is not employed during testing_". But the text…
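
For illustration, a minimal sketch of the setup that quote describes, assuming OCR runs only offline during data collection and never at test time; `run_ocr_detector` and the dictionary fields are hypothetical.

```python
def run_ocr_detector(image_path: str) -> str:
    """Hypothetical stand-in for the paper's OCR detector."""
    return "detected text"

def build_enriched_sample(image_path: str, caption: str) -> dict:
    # Offline data collection: OCR output is baked into the training text.
    return {"image": image_path,
            "text": f"{caption} OCR: {run_ocr_detector(image_path)}"}

def test_time_sample(image_path: str, question: str) -> dict:
    # Testing: no OCR call; the model must read text from pixels alone.
    return {"image": image_path, "text": question}
```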
-
Hi,
I have trained the OCR branch successfully, but when I try to evaluate it, I get the error below:
```
  File "tools/test.py", line 143, in <module>
    main()
  File "tools/test.py", line 85, in mai…
```
-
Dear Maintainers,
I'm currently trying to reproduce the zero-shot results of InstructBLIP. The caption of Table 5 says that for datasets with OCR tokens, the image query embeddings are simply append…
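
Since the caption is cut off here, the following is only a guess at the mechanism being asked about: a minimal sketch in which the projected image query embeddings are concatenated with the embeddings of the OCR-augmented instruction before the frozen LLM. All tensor names and shapes are assumptions, not the repo's API.

```python
import torch

batch, n_query, n_text, dim = 2, 32, 128, 4096
query_embeds = torch.randn(batch, n_query, dim)  # Q-Former output projected to LLM width
text_embeds = torch.randn(batch, n_text, dim)    # embeddings of instruction + OCR tokens

# Append the image query embeddings to the text sequence (assumed ordering).
llm_inputs = torch.cat([query_embeds, text_embeds], dim=1)
print(llm_inputs.shape)  # torch.Size([2, 160, 4096])
```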
-
### New Feature Summary
We now have a fair number of batches in the `batches` directory, and many of them are used in various evaluation efforts in the `aapb-evaluaions` repo. To find out for which evaluation…
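
A possible starting point, sketched under the assumption of local checkouts at the (hypothetical) paths below: scan the evaluation repo's files for mentions of each batch name.

```python
from pathlib import Path

batches_dir = Path("batches")            # hypothetical local path
evals_dir = Path("../aapb-evaluaions")   # hypothetical local checkout

# Concatenate all evaluation files once, then check each batch name against it.
eval_text = "\n".join(
    p.read_text(errors="ignore") for p in evals_dir.rglob("*") if p.is_file()
)
for batch in sorted(batches_dir.glob("*")):
    status = "used" if batch.stem in eval_text else "not referenced"
    print(f"{batch.name}: {status}")
```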
-
I am training PaddleOCR for the Tamil language with 70 images for training and 24 images for evaluation.
After the first epoch, the model takes a very long time (more than 5 hours) on Evaluat…
-
Thanks for creating this package!
As discussed in https://github.com/robertknight/ocrs/issues/14, it would be nice to add some evaluation benchmarks, and maybe optionally compare with tesseract or s…
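
A minimal sketch of one metric such a benchmark could report, assuming plain-text ground truth: character error rate (CER) via Levenshtein distance, which would apply equally to ocrs and tesseract output.

```python
# Character error rate: edit distance between prediction and ground truth,
# normalized by the ground-truth length.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def cer(predicted: str, truth: str) -> float:
    return levenshtein(predicted, truth) / max(len(truth), 1)

print(cer("he1lo world", "hello world"))  # ~0.091 (one substitution)
```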
-
Hi, I have a few questions on OCR evaluation.
1. When evaluating OCR performance on the DIR300 dataset (or the DocUNet benchmark), the sizes of the predicted image and the GT image are different. I suppose you h…
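
One common workaround, offered here only as an assumption about what might be done (not necessarily the authors' procedure): resample the predicted image to the GT resolution before running OCR on both. A minimal sketch with hypothetical paths:

```python
from PIL import Image

pred = Image.open("pred/0001.png")  # hypothetical paths
gt = Image.open("gt/0001.png")

# Align resolutions so OCR results and pixel-based metrics are comparable.
pred_aligned = pred.resize(gt.size, Image.BICUBIC)
pred_aligned.save("pred_aligned/0001.png")
```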
-
Hello. Thank you for your excellent work. I have some questions about the statements in the paper and hope you can answer them. In Table 3, you compared the differences between your method and other…
-
Thanks a lot for your excellent work. I wonder how you evaluate the trained model: do you use `./scripts/more/eval/pope.sh`, which uses `llava.eval.model_vqa_loader` for evaluation (there seems to be no modification f…
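
For context, POPE is scored as binary classification over "yes"/"no" answers; below is a minimal sketch of those metrics, which may differ from the repo's exact evaluation script.

```python
# POPE-style scoring: accuracy, precision, recall, and F1 over yes/no answers,
# treating "yes" (object present) as the positive class.
def pope_metrics(preds: list[str], labels: list[str]) -> dict:
    tp = sum(p == y == "yes" for p, y in zip(preds, labels))
    tn = sum(p == y == "no" for p, y in zip(preds, labels))
    fp = sum(p == "yes" and y == "no" for p, y in zip(preds, labels))
    fn = sum(p == "no" and y == "yes" for p, y in zip(preds, labels))
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {"accuracy": (tp + tn) / len(preds), "precision": precision,
            "recall": recall, "f1": f1}

print(pope_metrics(["yes", "no", "yes"], ["yes", "no", "no"]))
```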