rjbs / Email-MIME

perl library for parsing MIME messages

Corrupt filename is returned when name*0=... name*1=... #60

Closed: bokutin closed this issue 3 years ago

bokutin commented 5 years ago

Hi.

We encountered a bug where Email::MIME's filename() method did not return the correct filename.

I think the cause is that mime_decode and parse_content_type are applied in the wrong order: the header is MIME-decoded first and parsed afterwards, when it should be parsed first and the name decoded afterwards.

Sample code that reproduces the problem follows.

The message is a forwarded email received via Gmail. Gmail displays the attachment name correctly as "わたあめカメラ_20190401_191226.jpg".

#!/usr/local/bin/perl

use strict;
use Email::MIME;
use Email::MIME::ContentType;
use Email::MIME::Encode;
$Email::MIME::ContentType::STRICT_PARAMS = 0;

my $raw_email = <<'EMAIL';
From dummy1@gmail.com  Mon Apr  1 19:34:36 2019
Content-Type: multipart/mixed;
    boundary="=_3a1019ab7cce322859164c2d9ceac6c9"
Date: Mon, 01 Apr 2019 12:00:00 +0900
From: dummy2@docomo.ne.jp
To: dummy3@gmail.com
Subject: dummy subject

--=_3a1019ab7cce322859164c2d9ceac6c9
Content-ID: <01@190401.071418.gif>
Content-Type: image/gif;
 name="de_01_1071.gif"; 

--=_3a1019ab7cce322859164c2d9ceac6c9
Content-Type: image/jpeg;
 name*0="=?utf-8?B?44KP44Gf44GC44KB44Kr44Oh44OpXzIwMTkwNDAxXzE5MTIyNi5q?="
 name*1=" =?utf-8?B?cGc=?=";

--=_3a1019ab7cce322859164c2d9ceac6c9--
EMAIL

{
    my $email = Email::MIME->new($raw_email);
    my @parts = $email->parts;
    warn $parts[0]->filename; # de_01_1071.gif
    warn $parts[1]->filename; # "わたあめカメラ_20190401_191226.j" pg
}

my $raw_ct = 'image/jpeg; name*0="=?utf-8?B?44KP44Gf44GC44KB44Kr44Oh44OpXzIwMTkwNDAxXzE5MTIyNi5q?=" name*1=" =?utf-8?B?cGc=?="';
{
    my $decoded_ct   = Email::MIME::Encode::mime_decode($raw_ct);
    my $ct           = parse_content_type($decoded_ct);
    my $decoded_name = $ct->{attributes}{name};
    warn $decoded_name; # "わたあめカメラ_20190401_191226.j" pg
}
{
    my $ct           = parse_content_type($raw_ct);
    my $encoded_name = $ct->{attributes}{name};
    my $decoded_name = Email::MIME::Encode::mime_decode($encoded_name);
    warn $decoded_name; # わたあめカメラ_20190401_191226.jpg
}

__END__

% ./bug.pl
de_01_1071.gif at ./bug.pl line 34.
Wide character in warn at ./bug.pl line 35.
"わたあめカメラ_20190401_191226.j" pg at ./bug.pl line 35.
Wide character in warn at ./bug.pl line 43.
"わたあめカメラ_20190401_191226.j" pg at ./bug.pl line 43.
Wide character in warn at ./bug.pl line 49.
わたあめカメラ_20190401_191226.jpg at ./bug.pl line 49.

pali commented 5 years ago

Hi @bokutin. This is a duplicate of issue https://github.com/rjbs/Email-MIME/issues/31. The syntax name*0=... name*1=... is not supported yet. Two years ago I created a pull request that adds support for long file names via that syntax, see https://github.com/rjbs/Email-MIME/pull/51, but it seems that @rjbs as maintainer is not interested in fixing bugs or in reviewing and accepting patches from the community. So do not expect this bug to be fixed in the Email::MIME module in the near future, probably never. Better to look for another module for parsing emails; this one seems abandoned.
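
For readers who need a stopgap, the idea of handling the name*0=... name*1=... pieces before any MIME-word decoding can be sketched with core Perl only. This is a rough illustration under stated assumptions, not the code from pull request #51 and not Email::MIME's API: join_continuations is a hypothetical helper, and the regex only covers simple quoted or token values. Note also that RFC 2231 does not actually define RFC 2047 encoded-words inside parameter values; the sketch accepts them only because real-world mail such as the sample above contains them.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (NOT part of Email::MIME): collect the numbered
# continuation pieces attr*0, attr*1, ... from a raw Content-Type value
# and join them in numeric order, before any MIME-word decoding happens.
sub join_continuations {
    my ($raw_ct, $attr) = @_;
    my %piece;
    # Match attr*N="quoted value" or attr*N=token
    while ($raw_ct =~ /\b\Q$attr\E\*(\d+)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^\s;"]+))/g) {
        my ($n, $quoted, $token) = ($1, $2, $3);
        my $value = defined $quoted ? $quoted : $token;
        $value =~ s/\\(.)/$1/g if defined $quoted;    # undo quoted-pair escapes
        $piece{$n} = $value;
    }
    return join '', map { $piece{$_} } sort { $a <=> $b } keys %piece;
}

# Same header value as in the report above.
my $raw_ct = 'image/jpeg; name*0="=?utf-8?B?44KP44Gf44GC44KB44Kr44Oh44OpXzIwMTkwNDAxXzE5MTIyNi5q?=" name*1=" =?utf-8?B?cGc=?="';

# Step 1: reassemble the parameter while it is still encoded.
my $encoded = join_continuations($raw_ct, 'name');

# Step 2: only now decode the RFC 2047 encoded-words.
my $filename = decode('MIME-Header', $encoded);

binmode STDOUT, ':encoding(UTF-8)';
print "$filename\n";
```

The ordering is the point: once the quotes and the name*N labels have been stripped, the two encoded-words are separated only by whitespace, so the decoder can treat them as adjacent words and should yield a single name ending in ".jpg" rather than a quoted fragment followed by "pg".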

bokutin commented 5 years ago

Thank you for your reply.

I confirmed that Courriel would work without problems. https://metacpan.org/pod/Courriel

However, since Email::MIME is used in many production systems, it would be very useful if this were fixed in a new release.

I’m stuck...

pali commented 5 years ago

Look at the discussion about the Email:: modules: https://www.mail-archive.com/pep@perl.org/msg00557.html The maintainer has not answered any of my emails for 11 months.

bokutin commented 3 years ago

Do you have any recommendations for alternatives to Email::MIME? @pali

pali commented 3 years ago

It seems that issue #31 was fixed and my pull request #51 was finally merged. So support for name*0=... name*1=... should now be in git.

rjbs commented 3 years ago

I'm not sure exactly what the original submitter thinks is the correct behavior for their test program, but having run it with the latest versions on CPAN, I'm guessing it's still not right.

de_01_1071.gif at foo line 34.
=?utf-8?B?44KP44Gf44GC44KB44Kr44Oh44OpXzIwMTkwNDAxXzE5MTIyNi5q?= =?utf-8?B?cGc=?= at foo line 35.
Wide character in warn at foo line 43.
"わたあめカメラ_20190401_191226.j" pg at foo line 43.
Wide character in warn at foo line 49.
わたあめカメラ_20190401_191226.jpg at foo line 49.

pali commented 3 years ago

Maybe it is related to this issue https://github.com/rjbs/Email-MIME/issues/76 ?

rjbs commented 3 years ago

Yeah, could be. Blaaah email. Okay, I'll give this a real read-through tomorrow.

rjbs commented 3 years ago

(Thanks. ;) )

pali commented 3 years ago

... I'm looking at it again, and issue #76 is just a continuation of the discussion in this issue #60 by the same reporter ... so I guess this one can be closed, as the discussion has moved to #76.

rjbs commented 3 years ago

Yeah. Really, all these things need to be made clearer for users, but I have no silver bullet…