Danamir / imap-attachment-extractor

IMAP attachment exporter, with optional Thunderbird detach mode

Python-error (large mailbox) #7

Open tschloss opened 1 year ago

tschloss commented 1 year ago

Hi (great tool - thanks for providing it!!!)

I tried a smaller mailbox first, with no issues. The second test was on a mailbox with >65656 messages in INBOX. Environment: macOS (Intel), recent version, Python 3.10 (Homebrew). I see this result:

(.env) tschloss@Mac-mini imap-attachment-extractor % imap_aex --password -d "2022-08-07"
Password:
[Dry-run] Create extract dir /Users/tschloss/Programming/imap-attachment-extractor/INBOX.
Selected folder 'INBOX' (64556 mails).
45 messages corresponding to search.
1 messages with attachments.

Parsing mail: 'cronjob p487261 Update mvonline' [2022-08-07 05:40:06]
  Attachment 'p487261_Update mvonline.log' size (129.0B) is smaller than defined threshold (100.0KB), leave intact.
  Nothing extracted.
Traceback (most recent call last):
  File "/Users/tschloss/Programming/imap-attachment-extractor/.env/bin/imap_aex", line 33, in <module>
    sys.exit(load_entry_point('imap-attachment-extractor', 'console_scripts', 'imap_aex')())
  File "/Users/tschloss/Programming/imap-attachment-extractor/imap_aex.py", line 860, in cli
    main(options, defaults)
  File "/Users/tschloss/Programming/imap-attachment-extractor/imap_aex.py", line 852, in main
    imap.extract(**extract_kwargs)
  File "/Users/tschloss/Programming/imap-attachment-extractor/imap_aex.py", line 422, in extract
    mail = message_from_bytes(fetch[1])  # type: EmailMessage
  File "/usr/local/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/email/__init__.py", line 46, in message_from_bytes
    return BytesParser(*args, **kws).parsebytes(s)
  File "/usr/local/Cellar/python@3.10/3.10.6_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/email/parser.py", line 122, in parsebytes
    text = text.decode('ASCII', errors='surrogateescape')
AttributeError: 'int' object has no attribute 'decode'

Any ideas? Something wrong on my side?

Thank you Thomas

Danamir commented 1 year ago

Hi,

Curiously, the problem seems to originate from this email in particular: a returned mail part contains an integer instead of a string. I could add a quick fix to avoid the crash by converting the int to str, but that won't change the fact that the body of the message is not correctly found.
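For anyone hitting the same traceback, here is a small sketch of one way an integer can end up where message bytes are expected. It is only a guess at the cause, not a confirmed diagnosis: the response shapes below are made up and modelled on the kind of list that Python's imaplib returns from a fetch (a tuple for the message literal, plain bytes for trailing data), and describe_part is a hypothetical helper. Indexing a tuple with [1] gives the body bytes, while indexing a bytes object with [1] gives an int.

```python
from email import message_from_bytes

# Hypothetical fetch items (assumption: shaped like imaplib fetch data).
# A normal item is a (header, body) tuple; an unexpected one is plain bytes.
normal_item = (b"1 (RFC822 {29}", b"Subject: hello\r\n\r\nbody text\r\n")
odd_item = b" FLAGS (\\Seen))"


def describe_part(item):
    """Return the type name of item[1], the value passed on to the parser."""
    part = item[1]
    return type(part).__name__


print(describe_part(normal_item))  # tuple index -> the message body
print(describe_part(odd_item))     # bytes index -> a single byte value

# Passing that byte value to the parser fails the same way as in the traceback.
try:
    message_from_bytes(odd_item[1])
except AttributeError as exc:
    error_text = str(exc)
    print(error_text)
```

If this is what happens for that one message, converting the int to str would indeed hide the crash without recovering the body, since the real message bytes would sit in a different element of the response.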

Danamir commented 1 year ago

I added an exception handler around this portion of the code so that it displays an error message instead of crashing. You can then check the full message body with the --verbose option.
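A minimal sketch of what such a guard could look like, for readers who want to do something similar in their own scripts. This is not the actual patch in imap_aex.py: parse_fetched_part and its verbose parameter are hypothetical names, and the message wording is invented. Only message_from_bytes comes from the standard library's email package.

```python
import sys
from email import message_from_bytes
from email.message import Message
from typing import Optional


def parse_fetched_part(part, verbose: bool = False) -> Optional[Message]:
    """Parse a fetched mail part, or report and skip it if it is not bytes.

    Hypothetical helper: returns None instead of raising when the server
    response did not contain the message body where it was expected.
    """
    if not isinstance(part, bytes):
        print(
            f"  Error: expected message bytes, got {type(part).__name__}; skipping.",
            file=sys.stderr,
        )
        if verbose:
            # Assumption: in verbose mode the raw value is shown for inspection.
            print(f"  Raw part: {part!r}", file=sys.stderr)
        return None
    return message_from_bytes(part)


# Usage: a well-formed part parses, an unexpected int is skipped.
mail = parse_fetched_part(b"Subject: hi\r\n\r\nbody")
skipped = parse_fetched_part(41, verbose=True)
```

Checking the type up front rather than catching AttributeError keeps unrelated parser errors visible, at the cost of not covering other unexpected failures.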