Notes on each step:
Ingestion methods include:
Ingesting an image means blowing away the OCR data. It doesn't necessarily mean blowing away the phrasing data, though the phrase rectangles will naturally lose their text until OCR is back.
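As a minimal sketch of that invalidation rule (the `Page` and `PhraseRect` shapes here are my assumption, not the app's actual data model):

```python
from dataclasses import dataclass, field

@dataclass
class PhraseRect:
    x: int
    y: int
    w: int
    h: int
    text: str = ""         # derived from OCR, so it can go stale
    translation: str = ""

@dataclass
class Page:
    image: bytes | None = None
    ocr_data: dict | None = None
    phrase_rects: list[PhraseRect] = field(default_factory=list)

    def ingest(self, image: bytes) -> None:
        """Loading a new image blows away the OCR data but keeps
        the phrase rectangles; their text is blanked until OCR
        is back."""
        self.image = image
        self.ocr_data = None
        for rect in self.phrase_rects:
            rect.text = ""
            rect.translation = ""
```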
OCR methods include:
We may be able to implement image filters, which would have to be applied to the image before submitting it for OCR.
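For instance, a pre-OCR filter pass could be a simple pipeline like this (the `submit` callback stands in for whatever OCR API the app actually uses):

```python
from typing import Callable, Iterable

ImageFilter = Callable[[bytes], bytes]

def submit_for_ocr(
    image: bytes,
    submit: Callable[[bytes], dict],
    filters: Iterable[ImageFilter] = (),
) -> dict:
    # Apply each configured filter in order, then hand the
    # result to the (placeholder) OCR submission callback.
    for apply_filter in filters:
        image = apply_filter(image)
    return submit(image)
```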
Phrasing methods include:
Phrase rectangles don't have any meaning before OCR is performed. If they're persisting from a previous OCR, they shouldn't be editable or displayed until OCR is complete again, at which point they need to re-load their underlying text and re-submit for translation.
Translation is always automatic for each phrase rectangle. This process is nicely simple and low-level.
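Putting those two rules together, the OCR-complete handler might look roughly like this (`text_in_region` and `translate` are hypothetical callbacks, not real APIs from the project):

```python
def on_ocr_complete(page, ocr_data, text_in_region, translate):
    """OCR is done: phrase rects become meaningful again, so
    re-load each one's underlying text from the fresh OCR
    results and automatically re-submit it for translation."""
    page.ocr_data = ocr_data
    for rect in page.phrase_rects:
        rect.text = text_in_region(ocr_data, rect)
        rect.translation = translate(rect.text)
```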
So as far as states go, I see a need for these:
This is our initial state, and also the state immediately after ingestion. Phraserects, if we have them, are invisible. From here we can:
If we enter this state with AutoOCR on, and an image set, then we need to immediately send that image for OCR.
We enter this state immediately after OCR is complete. We can now display/manipulate Phraserects. If we enter this state with Phraserects, we need to re-load their text and re-submit them for translation. From here we can:
Having written that out, it seems like most of the modalities of the GUI (disabling the OCR button while the OCR request is ongoing, displaying/hiding the viewfinder, etc.) don't actually affect the core state of the app, and only two states are really needed to govern this: either we have an OCR, or we don't.
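A sketch of that two-state core (the state names and the `auto_ocr` flag are my own labels):

```python
from enum import Enum, auto

class OcrState(Enum):
    NO_OCR = auto()   # initial state, and the state right after ingestion
    HAS_OCR = auto()  # phrase rects are visible and editable

class AppCore:
    def __init__(self, auto_ocr: bool = False):
        self.state = OcrState.NO_OCR
        self.auto_ocr = auto_ocr
        self.image = None

    def enter_no_ocr(self) -> None:
        # Phrase rects are hidden/locked here; GUI modalities
        # (disabled buttons, viewfinder) live outside this core.
        self.state = OcrState.NO_OCR
        if self.auto_ocr and self.image is not None:
            self.start_ocr()  # AutoOCR: submit immediately on entry

    def enter_has_ocr(self) -> None:
        # OCR finished; rects reload text and re-submit for translation.
        self.state = OcrState.HAS_OCR

    def start_ocr(self) -> None:
        ...  # placeholder: submit self.image to the OCR backend
```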
Going to separate this into a couple of posts to try to keep it readable.
So the basic flow of the app as it stands is this: