emaadparacha closed this issue 3 years ago
Most likely the hOCR output produced by the custom hOCR skill just doesn't support non-horizontal text. Unfortunately I don't know many details of that implementation, but you could check whether it can be adjusted to support non-horizontal text as well. The OCRSkill should still return the bounding boxes; we probably just don't translate them well enough into the HTML that drives the hOCR text highlighting.
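To make the "translate the bounding boxes" point concrete, here is a minimal Python sketch of one way a rotated OCR box could be mapped to hOCR properties. This is not the HocrGenerator code (which I haven't inspected). It assumes the OCR result gives each word or line as four `(x, y)` corners starting at the text's top-left and going clockwise, and the function name `polygon_to_hocr_title` is made up for illustration. It emits the axis-aligned `bbox` plus the hOCR `textangle` property (degrees, counter-clockwise), and it would only help if the viewer doing the highlighting actually honors `textangle`.

```python
import math


def polygon_to_hocr_title(points):
    """Build an hOCR title string from four (x, y) corners.

    Assumption: corners start at the text's top-left and go clockwise,
    in image coordinates (y grows downward).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Axis-aligned box that encloses the (possibly rotated) text.
    x0, y0, x1, y1 = min(xs), min(ys), max(xs), max(ys)

    # Direction of the top edge (top-left -> top-right) gives the text angle.
    # y is negated because image y points down and hOCR angles are
    # counter-clockwise.
    (tlx, tly), (trx, try_) = points[0], points[1]
    angle = round(math.degrees(math.atan2(-(try_ - tly), trx - tlx)))

    title = f"bbox {x0} {y0} {x1} {y1}"
    if angle != 0:
        title += f"; textangle {angle}"
    return title


# Horizontal text: no textangle needed.
horizontal = polygon_to_hocr_title([(10, 20), (110, 20), (110, 40), (10, 40)])
# Sideways text running bottom to top.
sideways = polygon_to_hocr_title([(50, 200), (50, 100), (70, 100), (70, 200)])
```

With only the axis-aligned `bbox`, slanted text would get a highlight rectangle larger than the text itself, so carrying the angle through (or the full polygon) is probably what a real fix would need.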
Gotcha. Was the custom hOCR skill taken from https://github.com/Azure-Samples/azure-search-power-skills/blob/master/Vision/HocrGenerator ? If so, that might be the place to dig deeper and adjust it to support non-horizontal text.
The JFK sample actually came first, and enough people were interested in hOCR that we ported the skill over to the power skills repo you linked as well. So any solution you create to add non-horizontal support would probably be preferred there first, but we would likely want to implement it here too.
Closing due to inactivity
When I search for specific text, the results usually come back with that text highlighted on the document. I understand this is achieved with hOCR, but it only highlights text that is horizontal. Is it possible to enable highlighting for slanted or sideways text, and if so, how? I can still search for that text, and it shows up in the transcript, but there is no highlight on the document itself.