aws-samples / amazon-textract-transformer-pipeline

Post-process Amazon Textract results with Hugging Face transformer models for document understanding
MIT No Attribution

Update A2I dependencies and fix font translation in PDF rendering #31

Closed athewsey closed 1 year ago

athewsey commented 1 year ago

Issue #, if available: Fixes #30

Description of changes:

The PDF.js upgrade caused some issues because the PDFViewer layout now appears to size only the <canvas> correctly for each page, while the containing div.page may include additional space. I had to refactor how our annotation overlay manages its sizing, with a bit of an ugly hack, to accommodate this. The viewer was also not detecting itself as visible and therefore not initiating page renders, so I added a small transparent border to force the viewer div to have some initial on-screen size.
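To illustrate the overlay sizing idea, here is a minimal sketch and not the code from this PR. It assumes the overlay is absolutely positioned inside div.page and should be matched to the rendered <canvas> instead of the container. The names `Box`, `Measurable`, `Styleable`, `overlayBoxForCanvas` and `fitOverlayToCanvas` are hypothetical, and the sketch assumes div.page has no border of its own (otherwise the border width would need to be subtracted from the offsets).

```typescript
/** A rectangle in viewport coordinates, shaped like getBoundingClientRect() output. */
interface Box {
  left: number;
  top: number;
  width: number;
  height: number;
}

/** Anything that can report its on-screen box (an HTMLElement satisfies this). */
interface Measurable {
  getBoundingClientRect(): Box;
}

/** Anything with the inline style fields we set (an HTMLElement satisfies this). */
interface Styleable {
  style: { position: string; left: string; top: string; width: string; height: string };
}

/**
 * Compute where the overlay should sit, relative to the page container,
 * so that it covers exactly the canvas and ignores any extra container space.
 */
function overlayBoxForCanvas(pageBox: Box, canvasBox: Box): Box {
  return {
    left: canvasBox.left - pageBox.left,
    top: canvasBox.top - pageBox.top,
    width: canvasBox.width,
    height: canvasBox.height,
  };
}

/** Measure the page and canvas, then apply the resulting box to the overlay. */
function fitOverlayToCanvas(page: Measurable, canvas: Measurable, overlay: Styleable): void {
  const box = overlayBoxForCanvas(page.getBoundingClientRect(), canvas.getBoundingClientRect());
  overlay.style.position = "absolute";
  overlay.style.left = `${box.left}px`;
  overlay.style.top = `${box.top}px`;
  overlay.style.width = `${box.width}px`;
  overlay.style.height = `${box.height}px`;
}
```

In a browser this could be called with the div.page element, its <canvas> child, and the overlay element after each page render or resize. Keeping the geometry in a pure function makes it easy to check without a DOM.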

Also (as noted in tsconfig.json), because of an open issue with vue-tsc, we've been able to upgrade to TypeScript v5 but are currently ignoring deprecation warnings and are pinned to <5.5.

Testing done:

Built and deployed the UI to an existing pipeline (using Tesseract OCR) and tested in Amazon A2I in Firefox ESR. Page also tested locally in Chrome.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.