microsoft / OCR-Form-Tools

A set of tools to use in Microsoft Azure Form Recognizer and OCR services.
MIT License
515 stars, 174 forks

FOTT bug report - UI error with some text-based PDF #990

Open AndreaLeonangeli opened 3 years ago

AndreaLeonangeli commented 3 years ago

Describe the bug

FOTT does not properly handle a specific type of text-based PDF document. Unfortunately, I cannot share the document.

In the Tags Editor, the document fails to load from Azure Blob Storage: the thumbnail stays black and the loading spinner never stops.

In the "analyze with custom model" section, the UI keeps loading the file indefinitely and nothing else happens.

To Reproduce

Steps to reproduce the behavior:

  1. Open the analyze with custom model section.
  2. Click on 'Browse for a file' and select the PDF document.
  3. The UI keeps fetching and the spinning icon never stops. The console shows this message:

     Warning: getOperatorList - ignoring errors during "GetOperatorList: page 0" task: "DataCloneError: Failed to execute 'postMessage' on 'DedicatedWorkerGlobalScope': function at() { [native code] } could not be cloned."
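For anyone triaging this: the console message says that `postMessage` was asked to send a payload containing a native function named `at`. The structured clone algorithm used by `postMessage` cannot copy functions, which is why a `DataCloneError` is raised. Below is a minimal diagnostic sketch, not code from FOTT or pdf.js. The helper `findFunctionPaths` and the example payload are hypothetical, and the report does not confirm where the `at` function comes from. It only walks plain objects and arrays (not `Map` or `Set`) and lists where function values are hiding in a message before it is posted.

```typescript
// Hypothetical helper (not part of FOTT or pdf.js): returns the paths of all
// function values inside a payload. Functions are rejected by the structured
// clone algorithm, so any path listed here would make postMessage throw.
function findFunctionPaths(
  value: unknown,
  path: string = "$",
  seen: Set<object> = new Set<object>()
): string[] {
  if (typeof value === "function") {
    return [path];
  }
  if (value === null || typeof value !== "object") {
    return []; // primitives are always cloneable
  }
  if (seen.has(value)) {
    return []; // already visited: avoids infinite recursion on cycles
  }
  seen.add(value);

  const found: string[] = [];
  // Object.keys covers own enumerable keys of plain objects and array indexes.
  for (const key of Object.keys(value)) {
    const child = (value as Record<string, unknown>)[key];
    found.push(...findFunctionPaths(child, `${path}.${key}`, seen));
  }
  return found;
}

// Illustrative payload only: a native function buried in a nested structure,
// similar in spirit to the `function at() { [native code] }` in the warning.
const examplePayload = { page: 0, operators: [1, { handler: Math.abs }] };
console.log(findFunctionPaths(examplePayload)); // [ '$.operators.1.handler' ]
```

Running a check like this on the data handed to the worker (in a debug build) would show which field carries the function, which may help narrow down what is special about the failing PDFs.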

Expected behavior

The UI loads the text-based PDF document both in the tagging section and in the analyze page.

Screenshots image

Desktop (please complete the following information):

Additional context

I tried a different PDF of the same type and structure (created in the same way, I suppose) and the problem still occurs. With a completely different text-based PDF, the UI works properly. Note that the PDF that fails in FOTT is analyzed correctly when I call the Form Recognizer REST API directly.