benjamin-awd / StatementSensei

PDF to CSV conversion for your bank statements
https://statementsensei.streamlit.app/
GNU Affero General Public License v3.0
62 stars, 9 forks

DBS consolidated statement RuntimeError: Could not convert date #24

Open xRahul opened 1 month ago

xRahul commented 1 month ago

Hi,

First of all, thanks for the awesome app!

I was able to parse all my statements this year except the January statement, which threw a runtime exception:

RuntimeError: Could not convert date
Traceback:
File "streamlit\runtime\scriptrunner\exec_code.py", line 88, in exec_func_with_error_handling
File "streamlit\runtime\scriptrunner\script_runner.py", line 590, in code_to_exec
File "C:\Users\rahul\AppData\Local\StatementSensei\_internal\webapp\app.py", line 112, in <module>
    app()
File "C:\Users\rahul\AppData\Local\StatementSensei\_internal\webapp\app.py", line 24, in app
    processed_files = process_files(files)
                      ^^^^^^^^^^^^^^^^^^^^
File "C:\Users\rahul\AppData\Local\StatementSensei\_internal\webapp\app.py", line 60, in process_files
    processed_file = handle_file(document)
                     ^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\rahul\AppData\Local\StatementSensei\_internal\webapp\app.py", line 75, in handle_file
    file = parse_bank_statement(document)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "webapp\helpers.py", line 53, in parse_bank_statement
    processed_file = ProcessedFile(pipeline.transform(statement), metadata)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "monopoly\pipeline.py", line 92, in transform
File "monopoly\pipeline.py", line 87, in convert_date

The statement looks basically the same as the February statement, so it shouldn't be a format issue. Any idea how I can find out which date failed, or where in the PDF it appears?

Maybe some more debug logs would help?

benjamin-awd commented 1 month ago

Hey @xRahul, glad to hear that the app is working decently for you (minus this particular statement).

For more detailed logs, you can try using the monopoly library, which is what StatementSensei uses under the hood: https://github.com/benjamin-awd/monopoly

This allows you to do something like:

monopoly dbs-statement.pdf --verbose --single-process --nosafe

You can then put a breakpoint() in the code here, which should allow for more detailed debugging.
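
If stepping through with a debugger is inconvenient, another option is to make the failing value visible in the error message itself. The sketch below is only an illustration of that idea: `convert_date_verbose` is a hypothetical helper, not monopoly's actual `convert_date`, and the "DD Mon" input format and the explicit `year` argument are assumptions.

```python
from datetime import datetime


def convert_date_verbose(raw: str, year: int) -> datetime:
    """Parse a 'DD Mon' string (assumed format), naming the input on failure."""
    try:
        # %b expects an abbreviated month name such as "Jan" (locale-dependent).
        return datetime.strptime(f"{raw} {year}", "%d %b %Y")
    except ValueError as err:
        # Chain the original error so the traceback keeps both messages.
        raise RuntimeError(f"Could not convert date: {raw!r}") from err
```

With a wrapper like this, a call such as `convert_date_verbose("31 Feb", 2024)` raises a `RuntimeError` whose message includes the offending string, which narrows down where in the PDF to look.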

benjamin-awd commented 1 month ago

My guess is that the regex pattern hit something that looks like a transaction, but the date within it can't be properly resolved.
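
To illustrate how that kind of failure can arise, here is a minimal sketch. The `TRANSACTION` pattern, the sample lines, and the `find_unparseable` helper are all made up for illustration and are not taken from monopoly or from a real DBS statement. The point is only that a line can satisfy a transaction-shaped regex while its "date" group is not a real date.

```python
import re
from datetime import datetime

# Hypothetical pattern: "DD Mon", a description, then an amount.
TRANSACTION = re.compile(
    r"^(?P<date>\d{2} [A-Za-z]{3})\s+(?P<description>.+?)\s+(?P<amount>[\d,]+\.\d{2})$"
)


def find_unparseable(lines: list[str], year: int) -> list[str]:
    """Return lines that match the pattern but whose date cannot be parsed."""
    bad = []
    for line in lines:
        match = TRANSACTION.match(line)
        if match is None:
            continue  # not transaction-like at all, so it is never converted
        try:
            datetime.strptime(f"{match['date']} {year}", "%d %b %Y")
        except ValueError:
            bad.append(line)  # looked like a transaction, but the date is invalid
    return bad


# Made-up sample lines; the last one mimics a summary row, not a transaction.
lines = [
    "02 Jan GIRO PAYMENT 120.00",
    "Balance Brought Forward 1,000.00",
    "12 Mth AVERAGE BALANCE 1,000.00",
]
print(find_unparseable(lines, 2024))
```

Here the third line matches because "12 Mth" has the same shape as "02 Jan", yet "Mth" is not a month abbreviation, so the conversion fails. Running a scan like this over the extracted text of a statement is one way to locate the line responsible.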