Initiated by the University of Michigan Herbarium, VoucherVision harnesses the power of large language models (LLMs) to transform the transcription process of natural history specimen labels.
The OCR platforms return a lot of information that could be useful outside of VV, so it would be great if it were included in the JSON results that VV returns. In particular, I'd like to know which OCR engine and version was used, the coordinates of lines, words, and characters, and the confidence score reported for each. Since those coordinates would presumably be relative to the VV collage image, it would also be helpful to receive the coordinates of the sub-images that were extracted from the original image, so that the OCR coordinates can be mapped back onto the original.
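To make the request concrete, here is a rough sketch of what the extra metadata and the coordinate reconstruction could look like. All field names (`ocr_engine`, `sub_images`, `origin_bbox`, `collage_offset`, etc.) are illustrative assumptions, not part of VoucherVision's actual output format:

```python
# Hypothetical shape of the requested OCR metadata in the VV JSON results.
# Every key name here is an assumption for illustration only.
ocr_result = {
    "ocr_engine": "google-vision",       # which OCR platform was used
    "ocr_engine_version": "3.4.1",       # and its version
    "sub_images": [
        # For each sub-image: its bounding box within the original
        # specimen photo, and where it was pasted into the VV collage.
        {"id": 0,
         "origin_bbox": {"x": 120, "y": 340, "w": 800, "h": 200},
         "collage_offset": {"x": 0, "y": 0}},
    ],
    "words": [
        # Word coordinates as reported by OCR, relative to the collage.
        {"text": "Herbarium", "sub_image": 0, "confidence": 0.97,
         "bbox": {"x": 15, "y": 8, "w": 120, "h": 24}},
    ],
}

def collage_to_original(word: dict, sub_images: list) -> dict:
    """Map a word's collage-space bbox back onto the original image."""
    sub = sub_images[word["sub_image"]]
    # Shift: where the sub-image sits in the original, minus where it
    # sits in the collage.
    dx = sub["origin_bbox"]["x"] - sub["collage_offset"]["x"]
    dy = sub["origin_bbox"]["y"] - sub["collage_offset"]["y"]
    b = word["bbox"]
    return {"x": b["x"] + dx, "y": b["y"] + dy, "w": b["w"], "h": b["h"]}

print(collage_to_original(ocr_result["words"][0], ocr_result["sub_images"]))
```

With sub-image origins included in the output, this kind of two-line translation is all a downstream consumer would need to recover original-image coordinates.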