cathoderaydude / Babel

A simple translation tool for on-screen text using Google Cloud Vision and Translate. Intended for translating emulated videogames and similar things.

State Machine #37

hbloom1783 opened 3 years ago

hbloom1783 commented 3 years ago

Going to separate this into a couple of posts to try to keep it readable.

So the basic flow of the app as it stands is this:

  1. Ingest an image from someplace.
  2. Submit the ingested image for OCR.
  3. Add phrases (either via the autophraser or manually).
  4. Translate phrases.
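A minimal sketch of that flow, written as Python pseudocode. All of the names here (`Session`, `recognize`, `text_in`, `translate`) are invented for illustration, not the app's actual API:

```python
class Session:
    """Hypothetical container for the app's working data."""

    def __init__(self, ocr_backend, translator):
        self.ocr_backend = ocr_backend    # e.g. a Cloud Vision wrapper
        self.translator = translator      # e.g. a Cloud Translate wrapper
        self.image = None
        self.ocr_result = None
        self.phrases = []                 # phrase rectangles

    def ingest(self, image):
        # Step 1: take in a new image; any previous OCR result is now stale.
        self.image = image
        self.ocr_result = None

    def run_ocr(self):
        # Step 2: submit the ingested image for OCR.
        self.ocr_result = self.ocr_backend.recognize(self.image)

    def add_phrase(self, rect):
        # Step 3: add a phrase rectangle (autophraser or manual).
        self.phrases.append({"rect": rect, "text": None, "translation": None})

    def translate_phrases(self):
        # Step 4: read the OCR text under each rectangle and translate it.
        for phrase in self.phrases:
            phrase["text"] = self.ocr_result.text_in(phrase["rect"])
            phrase["translation"] = self.translator.translate(phrase["text"])
```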
hbloom1783 commented 3 years ago

Notes on each step:

Ingest

Ingestion methods include:

Ingesting an image means blowing away the OCR data. It doesn't necessarily mean blowing away the phrasing data, though it will naturally blank the phrases' text until OCR is back.
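Continuing the sketch above (same invented names), that invariant would look roughly like this: the OCR result goes away, the rectangles stay, and their text is blanked until the next OCR pass.

```python
def ingest(session, new_image):
    session.image = new_image
    session.ocr_result = None            # OCR data is blown away outright
    for phrase in session.phrases:       # phrase rects survive...
        phrase["text"] = None            # ...but their text is blank until OCR is back
        phrase["translation"] = None
```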

OCR

OCR methods include:

We may be able to implement image filters, which would have to be applied to the image before submitting for OCR.
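If filters do happen, one plausible place to hook them in is a small pipeline run just before the OCR request. The filter callables here are purely hypothetical:

```python
def run_ocr(session, filters=()):
    image = session.image
    for apply_filter in filters:
        image = apply_filter(image)      # e.g. threshold, upscale, invert...
    # The OCR backend sees the filtered image, never the original.
    session.ocr_result = session.ocr_backend.recognize(image)
```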

Phrasing

Phrasing methods include:

Phrase rectangles don't have any meaning before OCR is performed. If they're persisting from a previous OCR, they shouldn't be editable or displayed until OCR is complete again, at which point they need to re-load their underlying text and re-submit for translation.
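In terms of the earlier sketch, the post-OCR step for persisted rectangles might look like this (again, invented names, not the app's actual code):

```python
def on_ocr_complete(session):
    for phrase in session.phrases:
        phrase["editable"] = True
        phrase["visible"] = True
        # Re-load the underlying text from the fresh OCR result...
        phrase["text"] = session.ocr_result.text_in(phrase["rect"])
        # ...and re-submit it for translation.
        phrase["translation"] = session.translator.translate(phrase["text"])
```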

Translation

Translation is always automatic for each phrase rectangle. This process is nicely simple and low-level.
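One way to read "always automatic" is that giving a rectangle new text immediately triggers a translation request, with no separate translate step. A tiny sketch of that, using the same placeholder names:

```python
def set_phrase_text(session, phrase, text):
    phrase["text"] = text
    # No separate "translate" button: new text is sent for translation right away.
    phrase["translation"] = session.translator.translate(text)
```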

hbloom1783 commented 3 years ago

So as far as states go, I see a need for these:

NoOCR

This is our initial state, and also the state immediately after ingestion. Phraserects, if we have them, are invisible. From here we can:

If we enter this state with AutoOCR on, and an image set, then we need to immediately send that image for OCR.
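A rough entry handler for this state could look like the following; the AutoOCR flag and the handler name are hypothetical:

```python
def enter_no_ocr(session):
    session.state = "NoOCR"
    for phrase in session.phrases:
        phrase["visible"] = False        # Phraserects exist but are hidden
    # With AutoOCR on and an image already set, fire off OCR immediately.
    if session.auto_ocr and session.image is not None:
        session.run_ocr()
```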

OCRed

We enter this state immediately after OCR is complete. We can now display/manipulate Phraserects. If we enter this state with Phraserects, we need to re-load their text and re-submit them for translation. From here we can:
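And the matching entry handler for this state, reusing the `on_ocr_complete` sketch from the phrasing notes above:

```python
def enter_ocred(session):
    session.state = "OCRed"
    # Phraserects may be shown and manipulated again; if any were carried over,
    # re-load their text and re-submit them for translation.
    if session.phrases:
        on_ocr_complete(session)
```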

hbloom1783 commented 3 years ago

Having written that out, it seems like most of the modalities of the GUI (disabling the OCR button while an OCR request is in flight, displaying/hiding the viewfinder, etc.) don't actually affect the core state of the app, and only two states are really needed to govern it: either we have an OCR result, or we don't.
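So the core state machine collapses to something this small (a sketch, with the GUI modality treated as derived behaviour rather than state):

```python
from enum import Enum, auto

class CoreState(Enum):
    NO_OCR = auto()   # initial state, and the state right after ingestion
    OCRED = auto()    # OCR data present; phrase rects are live

# Button enable/disable, viewfinder visibility, etc. would be derived from
# CoreState plus transient flags, rather than being states of their own.
```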