SpamScope / mail-parser

Tokenizer for raw mails
https://pypi.python.org/pypi/mail-parser
Apache License 2.0

When parsing an eml attachment from Gmail, the attachment is parsed as an email instead of as an attachment #103

Closed JasBeilin closed 1 week ago

JasBeilin commented 2 years ago

Describe the bug: When parsing a message that has an eml attachment (which itself contains an image, for example), the eml is parsed as a message rather than as an attachment, so the image inside it is reported as an attachment of the outer message.

To Reproduce: Steps to reproduce the behavior:

  1. Create an email with an image attachment and send it to an available inbox.
  2. Forward the email from step 1 as an attachment.
  3. Run parse_from_bytes, parse_from_file, or parse_from_string on the forwarded message.
  4. See that the only attachment is the image, not the eml.
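The steps above can also be approximated without Gmail. The following is a minimal sketch that builds a comparable message using only Python's standard email package. The addresses, subjects, file names, and placeholder image bytes are all made up for illustration, and a message forwarded by Gmail may differ in its headers. It prints each MIME part and shows that is_multipart() is true for the attached message/rfc822 part, which is why a check based only on is_multipart() does not treat it as a leaf attachment.

```python
from email import message_from_bytes
from email.message import EmailMessage

# Inner message carrying a fake "image" attachment (placeholder bytes, not a real PNG).
inner = EmailMessage()
inner["Subject"] = "inner mail with image"
inner["From"] = "a@example.com"
inner["To"] = "b@example.com"
inner.set_content("see attached image")
inner.add_attachment(
    b"\x89PNG placeholder", maintype="image", subtype="png", filename="pic.png"
)

# Outer message that carries the inner one as a message/rfc822 attachment.
outer = EmailMessage()
outer["Subject"] = "Fwd as attachment"
outer["From"] = "b@example.com"
outer["To"] = "c@example.com"
outer.set_content("forwarded mail attached")
outer.add_attachment(inner, filename="inner.eml")

# These bytes could be passed to parse_from_bytes in place of a file read from disk.
raw_email = outer.as_bytes()

# Inspect the MIME tree with the standard library only.
parsed = message_from_bytes(raw_email)
for part in parsed.walk():
    print(
        part.get_content_type(),
        part.is_multipart(),
        part.get("Content-Disposition"),
    )
```

The raw_email bytes produced here are meant as a stand-in for the attached sample file, so the behavior can be checked without a mailbox.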

Expected behavior: I expect to see all attachments, including the eml file itself.

Raw mail

testing eml parsing with attachment copy.eml.zip

Environment:

Additional context: I used this code to parse it:

from mailparser import parse_from_bytes


class mailParser:
    def run(self, raw_email):
        # raw_email is already bytes, so no extra conversion is needed
        return parse_from_bytes(raw_email)


parser = mailParser()
# Read the sample message in binary mode
with open('testing eml parsing with attachment.eml', 'rb') as fhdl:
    raw_email = fhdl.read()
res = parser.run(raw_email)
print(res)

The result: one attachment, the inner image.

Possible solution: I tried changing line 353 in mailparser.py to

    if not p.is_multipart() or 'attachment' in p.get('content-disposition', '')

so that parts marked as attachments are processed as attachments even when is_multipart() is true for them, as it is for an attached message (message/rfc822). I also added this at line 279:

                        is_attachment = True
                        payload = p.get_payload()
                        filename = dict(payload[0]._headers).get('Subject')

As a result, I got 2 attachments: both the image and the eml file.
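The effect of the proposed condition can be illustrated with the standard library alone. The sketch below is not mailparser's code: list_attachments and its include_marked_multipart flag are hypothetical names, the test message is synthetic, and the Subject fallback uses the public get() call rather than the _headers attribute from the patch. It compares a leaf-only check against one that also accepts parts whose Content-Disposition says attachment.

```python
from email import message_from_bytes
from email.message import EmailMessage


def list_attachments(msg, include_marked_multipart=True):
    """Return (filename, content_type) for each part treated as an attachment.

    Hypothetical helper: with include_marked_multipart=False only leaf parts
    count; with True, parts marked "attachment" count even if is_multipart().
    """
    found = []
    for p in msg.walk():
        disposition = str(p.get("content-disposition", "")).lower()
        if "attachment" not in disposition:
            continue  # bodies and container parts are not attachments
        if p.is_multipart() and not include_marked_multipart:
            continue  # leaf-only behavior: the attached eml is skipped
        filename = p.get_filename()
        if filename is None and p.is_multipart():
            # Fallback in the spirit of the patch: Subject of the embedded message
            filename = p.get_payload()[0].get("Subject")
        found.append((filename, p.get_content_type()))
    return found


# Synthetic message: an eml attachment that itself contains an image.
inner = EmailMessage()
inner["Subject"] = "inner mail with image"
inner.set_content("see attached image")
inner.add_attachment(
    b"\x89PNG placeholder", maintype="image", subtype="png", filename="pic.png"
)
outer = EmailMessage()
outer["Subject"] = "Fwd as attachment"
outer.set_content("forwarded mail attached")
outer.add_attachment(inner, filename="inner.eml")

parsed = message_from_bytes(outer.as_bytes())
print(list_attachments(parsed, include_marked_multipart=False))
print(list_attachments(parsed, include_marked_multipart=True))
```

With the flag off, only the inner image is listed, which matches the reported behavior. With it on, the eml part is listed as well, and the image still appears because walk() continues into the attached message.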

sgeulette commented 2 years ago

Hello, I have the same problem... Regards

fedelemantuano commented 2 years ago

Thanks for this submission. I will check the issue.

fedelemantuano commented 1 week ago

To review in the new version. I will open a new issue if needed.