toofishes / python-pgpdump

PGP packet parser library

Made UID Packet parsing more robust against malformed UTF-8 encoding #5

Closed muelli closed 12 years ago

muelli commented 12 years ago

I encountered a problem with the following packets:

<PublicSubkeyPacket: 0x6AC7E18804DC3ED5, ElGamal Encrypt-Only, length 525>
Reading from byte 87414
<SignaturePacket: DSA Digital Signature Algorithm, SHA1, length 70>
Reading from byte 87484
<PublicKeyPacket: 0x6FF2E1C3EE39785F, DSA Digital Signature Algorithm, length 418>
Traceback (most recent call last):
  File "/home/muelli/hg/openpgp-things/mypgpdump.py", line 38, in <module>
    sys.exit(main(sys.argv))
  File "/home/muelli/hg/openpgp-things/mypgpdump.py", line 23, in main
    for packet in data.packets():
  File "/home/muelli/git/python-pgpdump/pgpdump/data.py", line 30, in packets
    total_length, packet = construct_packet(self.data, offset)
  File "/home/muelli/git/python-pgpdump/pgpdump/packet.py", line 562, in construct_packet
    packet = PacketType(tag, name, new, packet_data)
  File "/home/muelli/git/python-pgpdump/pgpdump/packet.py", line 410, in __init__
    super(UserIDPacket, self).__init__(*args, **kwargs)
  File "/home/muelli/git/python-pgpdump/pgpdump/packet.py", line 21, in __init__
    self.parse()
  File "/home/muelli/git/python-pgpdump/pgpdump/packet.py", line 413, in parse
    self.user = self.data.decode('utf8')
  File "/tmp/pgpdump/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf8 in position 7: invalid start byte

After the change, the packet decodes to:

<UserIDPacket: u'Jan Tet\ufffdev' (u'tetrev@guns-info.cz'), length 32>
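
For readers without the patch at hand, the behaviour above can be reproduced with a lenient decode. This is only a sketch, not the actual diff: the `raw` bytes are a reconstruction from the traceback and the packet length, and `decode_user_id` is a hypothetical helper name, not part of pgpdump.

```python
# Assumed reconstruction of the User ID packet body (32 bytes).
# 0xf8 at position 7 is not a valid UTF-8 start byte.
raw = b"Jan Tet\xf8ev <tetrev@guns-info.cz>"


def decode_user_id(data):
    """Decode User ID bytes as UTF-8, replacing undecodable bytes with U+FFFD.

    Hypothetical helper; a strict data.decode('utf8') raises
    UnicodeDecodeError on this input instead.
    """
    return data.decode("utf-8", errors="replace")


user = decode_user_id(raw)
```

The trade-off is that the original byte is lost: the replacement character only marks where the malformed data was.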

toofishes commented 12 years ago

Thanks! I pulled your patch into my local copy; it will be pushed soon. As an FYI, this particular user ID string was encoded using ISO 8859-2.
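
Since the string turned out to be ISO 8859-2, one possible alternative to a replacement character is to try a short list of encodings in order. The sketch below is an assumption about how that could look, not something pgpdump does; `decode_with_fallback` and its default encoding list are hypothetical. Note that a single-byte codec such as ISO 8859-2 accepts every byte value, so it cannot tell when the guess is wrong.

```python
def decode_with_fallback(data, encodings=("utf-8", "iso8859-2")):
    """Try each encoding in turn and return the first successful decode.

    Hypothetical helper: the encoding list is a guess and would need to be
    chosen per data set.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")


# In ISO 8859-2, byte 0xf8 is U+0159 (LATIN SMALL LETTER R WITH CARON).
name = decode_with_fallback(b"Jan Tet\xf8ev")
```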