silvhua / expense-classifier-backend


Backend `ParserFunction`: Parse & prepare receipt data for classification by receipt total #4

Closed silvhua closed 1 week ago

silvhua commented 2 weeks ago

The `ParserFunction` is in app.py. It currently parses a receipt that the front end uploaded to S3 and returns a dataframe with the parsed data.

The returned dataframe could be refined to contain only the information essential to the task.

To invoke the function, see the back-end setup task.

Data processing should account for the fact that receipt parsing likely won't be consistent, since dates, times, totals, addresses, etc. are formatted differently from receipt to receipt.
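One way to handle inconsistent formatting is to try several known formats and return `None` when nothing matches, so downstream code can flag the receipt instead of failing. This is a minimal sketch, not what app.py does: the names `DATE_FORMATS`, `parse_receipt_date`, and `parse_amount` are hypothetical, and the list of formats is an assumption that would need to grow as new receipt layouts turn up.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional

# Hypothetical list of date layouts to try, in order.
# "%d/%m/%Y" assumes day-first dates; swap it for "%m/%d/%Y" if receipts are month-first.
DATE_FORMATS = ("%Y-%m-%d", "%d-%b-%Y", "%d/%m/%Y", "%b %d, %Y")


def parse_receipt_date(text: str) -> Optional[date]:
    """Return the first successful parse of `text`, or None if no format matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    return None


def parse_amount(text: str) -> Optional[Decimal]:
    """Parse a money string such as "$1,250.00"; return None if it is not numeric."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None
```

Returning `None` rather than raising keeps one badly formatted field from discarding the whole receipt. `Decimal` is used for amounts to avoid floating-point rounding on currency values.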

silvhua commented 1 week ago

Done in commit e5fe04553b5c5d653d4d5d85268ea7fe5e768856

Text output from AWS Lambda console:

{
  "statusCode": 200,
  "body": {
    "line_items": [
      {
        "id": "9",
        "mention_text": "250.00",
        "type": "line_item/amount",
        "currency_code": "USD",
        "units": 250,
        "normalized_value": "250",
        "confidence": "84%",
        "pages": [
          1
        ]
      },
      {
        "id": "14",
        "mention_text": "322.81",
        "type": "line_item/amount",
        "currency_code": "USD",
        "units": 322,
        "nanos": 810000000,
        "normalized_value": "322.81"
      },
      {
        "id": "16",
        "mention_text": "322.81",
        "type": "line_item/amount",
        "currency_code": "USD",
        "units": 322,
        "nanos": 810000000,
        "normalized_value": "322.81"
      }
    ],
    "total_amount": {
      "id": "1",
      "mention_text": "250.00",
      "type": "total_amount",
      "currency_code": "USD",
      "units": 250,
      "normalized_value": "250"
    },
    "receipt_date": {
      "id": "5",
      "mention_text": "16-Oct-2021",
      "type": "receipt_date",
      "year": 2021,
      "month": 10,
      "day": 16,
      "normalized_value": "2021-10-16"
    },
    "supplier_address": {
      "id": "6",
      "mention_text": "661 University Ave., Toronto, ON M5G 1M1",
      "type": "supplier_address"
    },
    "supplier_name": {
      "id": "7",
      "mention_text": "",
      "type": "supplier_name",
      "normalized_value": "CrossFit BC"
    },
    "supplier_city": {
      "id": "19",
      "mention_text": "",
      "type": "supplier_city",
      "normalized_value": "Toronto"
    }
  }
}
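
To reduce this output to the essentials for classification by receipt total, something like the following could work. It is a sketch under assumptions: `summarize_receipt`, `entity_text`, and `entity_amount` are hypothetical names, `body` is assumed to be the already-decoded dictionary shown above, and `nanos` is assumed to hold billionths of a currency unit (which matches 322 units plus 810000000 nanos giving 322.81 in the output).

```python
from decimal import Decimal
from typing import Any, Dict, Optional


def entity_text(entity: Optional[Dict[str, Any]]) -> Optional[str]:
    """Prefer normalized_value, fall back to mention_text; None if both are missing or empty."""
    if not entity:
        return None
    return entity.get("normalized_value") or entity.get("mention_text") or None


def entity_amount(entity: Optional[Dict[str, Any]]) -> Optional[Decimal]:
    """Combine units and nanos into one Decimal (assumes nanos = billionths of a unit)."""
    if not entity or "units" not in entity:
        return None
    return Decimal(entity["units"]) + Decimal(entity.get("nanos", 0)) / Decimal(10**9)


def summarize_receipt(body: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields needed for classification; missing entities become None."""
    return {
        "total": entity_amount(body.get("total_amount")),
        "date": entity_text(body.get("receipt_date")),
        "supplier_name": entity_text(body.get("supplier_name")),
        "supplier_city": entity_text(body.get("supplier_city")),
        "supplier_address": entity_text(body.get("supplier_address")),
        "line_item_amounts": [entity_amount(item) for item in body.get("line_items", [])],
    }
```

Falling back from `normalized_value` to `mention_text` matters for this output because some entities carry only one of the two: `supplier_address` has no `normalized_value`, while `supplier_name` has an empty `mention_text`.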