Integrate Azure Computer Vision for OCR text generation for uploaded files

jeffpaul commented 5 years ago

Is your enhancement related to a problem? Please describe. Azure's Computer Vision can read both printed and handwritten text in images. One potential use case is academics who tend to upload large PDF files or documents to WordPress for storage; ClassifAI could then use OCR to generate a related Post or Page for each PDF/document uploaded.
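As a rough illustration of the OCR step, here is a minimal sketch that flattens an OCR result into plain text that could seed a Post or Page. The response shape (`status`, `analyzeResult.readResults[].lines[].text`) is an assumption modeled on Azure's Read API v3.x output, and `ocrTextFromReadResult` is a hypothetical helper name, not something ClassifAI ships.

```typescript
// Assumed shape of a Read-style OCR result; only the fields used here are modeled.
interface ReadLine {
  text: string;
}
interface ReadPage {
  lines: ReadLine[];
}
interface ReadResult {
  status: string;
  analyzeResult?: { readResults: ReadPage[] };
}

// Hypothetical helper: join recognized lines with newlines and pages with blank lines.
// Returns an empty string when the operation has not succeeded or has no result.
function ocrTextFromReadResult(result: ReadResult): string {
  if (result.status !== "succeeded" || !result.analyzeResult) {
    return "";
  }
  const pages: string[] = [];
  for (const page of result.analyzeResult.readResults) {
    pages.push(page.lines.map((line) => line.text).join("\n"));
  }
  return pages.join("\n\n");
}
```

The joined text could then be used as the content of the generated Post or Page; how pages and lines should map onto blocks is left open here.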

Describe the solution you'd like: If this concept gains positive feedback, we'll want to iterate on it and build out more specific requirements before we dive into development.

Designs: n/a

Describe alternatives you've considered: none

Additional context: none

helen commented 4 years ago

Here's my planning doc, I can transfer more of the information over a little later: https://docs.google.com/document/d/1foLApzfls-IPtXSC8mM8NJObib9njlGs24HPt0PDOBk/edit?usp=sharing

helen commented 4 years ago

Thinking through some flows here so we don't end up annoying users with always inserting the text or duplicating descriptive text:

- When inserting from the media library modal, the user should be able to opt into inserting the description with the image. This is maybe a checkbox next to the "Select" button at the bottom, like "Insert description as text" (can change wording later).
- When dragging and dropping an image into the block editor, we need a separate flow for inserting the description. I think we could ship this initially with a toolbar button, because it's a trained workflow for now, and consider working toward some kind of dismissible notification overlay on the image that says "OCR text detected; do you want to add the text to your content?" later.

Both methods need to add an aria-describedby attribute to the img element that references an ID assigned to the inserted text. We may need to consider making the inserted text a group or something so that any editing that adds paragraphs doesn't end up disrupting it - I would need to test that directly to decide though. We could also consider doing it as a poem block or something where line breaks don't start a new block.
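To make the attribute wiring concrete, here is a minimal sketch that adds aria-describedby to the first img tag in an HTML fragment. `addDescribedBy` is a hypothetical helper written for illustration; it uses a simple regular expression, so it assumes attribute values do not contain a `>` character, and it is not the implementation in the PR.

```typescript
// Hypothetical helper: add aria-describedby to the first <img> tag in `html`,
// pointing at the element whose id is `descriptionId`.
// If the tag already has an aria-describedby attribute, the HTML is returned unchanged.
function addDescribedBy(html: string, descriptionId: string): string {
  return html.replace(
    /<img\b([^>]*?)(\s*\/?)>/i,
    (match: string, attrs: string, close: string): string => {
      if (/\saria-describedby\s*=/i.test(attrs)) {
        return match; // Do not overwrite an existing association.
      }
      return `<img${attrs} aria-describedby="${descriptionId}"${close}>`;
    }
  );
}
```

For example, `addDescribedBy('<img src="a.png" />', "description-attachment-42")` yields an img tag that references an element with that ID, which the inserted text block would need to carry.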

I think the ID should be something like description-attachment-##### where ##### is the attachment post ID. Knowing that naming convention, we can also be a little smart and detect the following:

- When updating an image block, if a description block with that same ID already exists, don't show the prompt to insert the description.
- If the image is deleted, indicate that the block is no longer attached to an image (red border or something). The user can then go to the "anchor" field to edit that association. I don't think this is particularly important for initial ship.
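The two checks above can be sketched against a simplified block tree. `BlockLike` is an assumed, pared-down shape (a block name, an optional attachment `id`, an optional `anchor`, and nested blocks); the helper names are hypothetical, and treating `core/image` as the image block name is an assumption about the editor, not something confirmed in this thread.

```typescript
// Assumed minimal block shape; real editor block objects carry more fields.
interface BlockLike {
  name: string;
  attributes: { id?: number; anchor?: string };
  innerBlocks?: BlockLike[];
}

const DESCRIPTION_PREFIX = "description-attachment-";

// Build the ID proposed in this thread: description-attachment-<attachment post ID>.
function descriptionAnchor(attachmentId: number): string {
  return DESCRIPTION_PREFIX + attachmentId;
}

// Collect every block in the tree, including nested ones (e.g. inside a group).
function flattenBlocks(blocks: BlockLike[]): BlockLike[] {
  const all: BlockLike[] = [];
  for (const block of blocks) {
    all.push(block);
    if (block.innerBlocks) {
      all.push(...flattenBlocks(block.innerBlocks));
    }
  }
  return all;
}

// True if a description block for this attachment already exists,
// in which case the insert prompt should not be shown.
function hasDescriptionFor(blocks: BlockLike[], attachmentId: number): boolean {
  const anchor = descriptionAnchor(attachmentId);
  return flattenBlocks(blocks).some((b) => b.attributes.anchor === anchor);
}

// Description blocks whose image is no longer present in the content.
function orphanedDescriptions(blocks: BlockLike[]): BlockLike[] {
  const all = flattenBlocks(blocks);
  const imageIds: number[] = [];
  for (const b of all) {
    if (b.name === "core/image" && typeof b.attributes.id === "number") {
      imageIds.push(b.attributes.id);
    }
  }
  return all.filter((b) => {
    const anchor = b.attributes.anchor;
    if (!anchor || anchor.indexOf(DESCRIPTION_PREFIX) !== 0) {
      return false;
    }
    const id = Number(anchor.slice(DESCRIPTION_PREFIX.length));
    return imageIds.indexOf(id) === -1;
  });
}
```

Flattening the tree first means a description wrapped in a group is still found, which fits the idea of making the inserted text a group.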

Is there anything else that needs to be covered here?

dinhtungdu commented 4 years ago

@helen In https://github.com/10up/classifai/pull/228/commits/b6df398df122883c2d50fab378ead7501d3e0df7 I added:

- An `aria-describedby` attribute on the image, added using the `the_content` filter.
- A modal that pops up when the newly inserted image has text detected by Azure (for all three types of uploads: drag, media library and upload).
- A button in the settings sidebar to manually insert the description.

Note:

- I feel the modal is a little bit annoying; I may try using a notice to replace it.
- The description is currently a paragraph; this needs to be updated.

helen commented 4 years ago

@dinhtungdu I left an implementation comment on the PR about when to show a prompt (only when there are actually ClassifAI OCR results, not just any description). I agree the modal is pretty intrusive. If we can figure out a notice, that seems ideal; even better if it can be positioned over the image, I think, since presumably your visual focus is there.

Let's not show the settings sidebar button if there are no OCR results. We might want to refine that logic later in case somebody decides to incorporate a workflow where they're manually adding descriptions to images that they also want inserted the same way, but again, for a first run I think that's a better way to ramp into the feature.
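The visibility rules discussed here could be reduced to two small predicates. This is only a sketch: `ImageOcrState`, its `ocrText` field, and both function names are hypothetical, and how ClassifAI actually stores OCR results is not specified in this thread.

```typescript
// Hypothetical view of what the editor knows about a selected image.
interface ImageOcrState {
  ocrText: string; // Text from ClassifAI OCR; empty when nothing was detected.
  descriptionInserted: boolean; // True if a matching description block already exists.
}

// Show the sidebar button only when there are actual OCR results.
function shouldShowInsertButton(state: ImageOcrState): boolean {
  return state.ocrText.trim().length > 0;
}

// Prompt only when there are OCR results and no description has been inserted yet.
function shouldShowPrompt(state: ImageOcrState): boolean {
  return shouldShowInsertButton(state) && !state.descriptionInserted;
}
```

Keeping the OCR check in one place would make it easier to relax later, for example to also allow manually written descriptions.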

10up / classifai

Integrate Azure Computer Vision for OCR text generation for uploaded files #111