aws-samples / amazon-textract-textractor

Analyze documents with Amazon Textract and generate output in multiple formats.
Apache License 2.0

Fix invalid escape in BoundingBox docstring #395

Closed simonschmidt closed 3 weeks ago

simonschmidt commented 2 months ago

The BoundingBox docstring uses a LaTeX-style \in, but Python treats \i as an invalid escape sequence and issues a SyntaxWarning. With -Werror that warning becomes an error, so the import fails:

$ python -Werror -c 'import textractor'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/app/amazon-textract-textractor/textractor/__init__.py", line 3, in <module>
    from .textractor import Textractor
  File "/app/amazon-textract-textractor/textractor/textractor.py", line 53, in <module>
    from textractor.entities.document import Document
  File "/app/amazon-textract-textractor/textractor/entities/document.py", line 17, in <module>
    from textractor.entities.expense_document import ExpenseDocument
  File "/app/amazon-textract-textractor/textractor/entities/expense_document.py", line 11, in <module>
    from textractor.entities.expense_field import (
  File "/app/amazon-textract-textractor/textractor/entities/expense_field.py", line 6, in <module>
    from textractor.entities.bbox import BoundingBox
  File "/app/amazon-textract-textractor/textractor/entities/bbox.py", line 32
    """
    ^^^
SyntaxError: invalid escape sequence '\i'

Fixed by escaping the backslash, so the docstring source now reads \\in.
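
For illustration, here is a minimal sketch of the problem and two ways around it. The docstrings and the helper name escape_warnings are hypothetical and are not taken from bbox.py. It compiles three small snippets and records any warnings the compiler raises. The warning category depends on the Python version (DeprecationWarning on older releases, SyntaxWarning on newer ones), so the sketch only checks whether a warning was emitted.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return the warnings raised while compiling it."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record warnings even if normally hidden
        compile(source, "<example>", "exec")
    return [str(w.message) for w in caught]


# Hypothetical docstrings for illustration only.
# The compiled source of `bad` contains a single backslash: x \in [0, 1]
bad = 'def f():\n    """x \\in [0, 1]"""\n'
# The compiled source of `escaped` contains a doubled backslash: x \\in [0, 1]
escaped = 'def f():\n    """x \\\\in [0, 1]"""\n'
# The compiled source of `raw` is a raw docstring: r"""x \in [0, 1]"""
raw = 'def f():\n    r"""x \\in [0, 1]"""\n'

print("unescaped:", escape_warnings(bad))      # one warning about the escape
print("escaped:  ", escape_warnings(escaped))  # no warnings
print("raw:      ", escape_warnings(raw))      # no warnings
```

Both alternatives produce the same docstring text at runtime. Doubling the backslash is the approach this PR takes; a raw docstring is an alternative that keeps the LaTeX readable in the source.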

Belval commented 3 weeks ago

Thank you for your contribution