mysociety / ipvtheme

The Alaveteli theme for Informace pro vsechny (Czech Republic)
http://infoprovsechny.cz
MIT License

Can't parse incoming messages with Content-Type: application/pkcs7-mime #40

Open garethrees opened 6 years ago

garethrees commented 6 years ago

http://www.infoprovsechny.cz/request/dokumentace_k_projektu_socialni?nocache=incoming-11330#incoming-11330

[Screenshot: 2017-10-05 at 12:51:26]

garethrees commented 6 years ago

Also http://www.infoprovsechny.cz/request/ucastnici_rizeni_podle_sidla_9

garethrees commented 6 years ago

Another case here https://groups.google.com/a/mysociety.org/forum/#!topic/alaveteli/d6cLMhmAk9Y

garethrees commented 5 years ago

Well, here's how you can extract the PDF from one example:

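require 'mail'
require 'openssl'
require 'base64'

raw = File.read('tmp/11378.eml')
m = Mail.new(raw)
# The message body is a base64-encoded PKCS#7 structure; decode it as ASN.1
a = OpenSSL::ASN1.decode(Base64.decode64(m.body.raw_source))
# Walk the ASN.1 tree down to the embedded content and join its chunks
v = a.value[1].value[0].value[2].value[1].value[0].value.map(&:value).join
# The embedded content is itself a MIME message that carries the PDF
eml = Mail.new(v)
File.open('tmp/blah.pdf', 'wb') { |out| out.write(eml.attachments.first.decoded) }
File.open('tmp/blah.pdf', &:gets)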
# => "%PDF-1.3\r1 0 obj\r<</Type /XObject /Subtype /Image /Name /Im1 /Width 1240 /Height 1753 /Length 108309/ColorSpace /DeviceRGB /BitsPerComponent 8 /Filter [ /DCTDecode ] >> stream\r\xFF\xD8\xFF\xE0\u0000\u0010JFIF\u0000\u0001\u0001\u0001\u0000\x96\u0000\x96\u0000\u0000\xFF\xDB\u0000C\u0000\b\u0006\u0006\a\u0006\u0005\b\a\a\a\t\t\b\n"

dracos commented 5 years ago

This is a bit nicer (Ruby provides no docs for this class; https://stackoverflow.com/a/1999887/669631 was the most helpful):

...
a = OpenSSL::PKCS7.new(Base64.decode64(m.body.raw_source))
a.verify([], OpenSSL::X509::Store.new, nil, OpenSSL::PKCS7::NOVERIFY) # NOVERIFY: don't verify the signer's certificate
v = a.data # populated by the verify call above
...
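
For anyone who wants to try this without the original .eml, here is a minimal self-contained sketch of the same unwrapping approach, using only Ruby's standard library. The helper name `unwrap_pkcs7_signed_data`, the throwaway self-signed certificate, and the sample payload are all assumptions made for illustration; none of them come from Alaveteli or from the real messages.

```ruby
require 'openssl'
require 'base64'

# Hypothetical helper (not part of Alaveteli or the mail gem): returns the
# content embedded in a base64-encoded PKCS#7 signed-data body.
def unwrap_pkcs7_signed_data(base64_body)
  pkcs7 = OpenSSL::PKCS7.new(Base64.decode64(base64_body))
  # NOVERIFY skips validation of the signer's certificate chain, so the
  # result must not be treated as authenticated.
  pkcs7.verify([], OpenSSL::X509::Store.new, nil, OpenSSL::PKCS7::NOVERIFY)
  # OpenSSL::PKCS7#data is populated by the verify call above.
  pkcs7.data
end

# Usage: sign a small payload with a throwaway self-signed certificate,
# then unwrap it again.
key = OpenSSL::PKey::RSA.new(2048)
name = OpenSSL::X509::Name.parse('/CN=example')
cert = OpenSSL::X509::Certificate.new
cert.version = 2
cert.serial = 1
cert.subject = name
cert.issuer = name
cert.public_key = key.public_key
cert.not_before = Time.now - 60
cert.not_after = Time.now + 3600
cert.sign(key, OpenSSL::Digest.new('SHA256'))

inner = "Content-Type: text/plain\r\n\r\nhello"
# BINARY stops OpenSSL from rewriting line endings in the payload.
signed = OpenSSL::PKCS7.sign(cert, key, inner, [], OpenSSL::PKCS7::BINARY)
body = Base64.encode64(signed.to_der)

puts unwrap_pkcs7_signed_data(body)
```

With a real message, `body` would be `m.body.raw_source` from the snippet above, and the returned string could be passed to `Mail.new` to get at the attachments.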

garethrees commented 5 years ago

Now that we know how to unwrap the pkcs7-mime, we need to inject this into the mail parsing somewhere. I think the best place to start looking is MailHandler.decode_attached_part, but I need to figure out how all the mail parsing works.
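
Wherever the hook ends up, the parser will first need to recognise these parts. A rough sketch of that detection step, assuming the decision can be made from the Content-Type header value alone. The method name `pkcs7_signed_data?` is hypothetical and not part of Alaveteli, and treating a missing `smime-type` parameter as "try to unwrap" is an assumption rather than something confirmed against the real messages.

```ruby
# Media types used for S/MIME PKCS#7 bodies (the x- variant is the legacy name).
PKCS7_MIME_TYPES = %w[application/pkcs7-mime application/x-pkcs7-mime].freeze

# Hypothetical predicate: does this Content-Type header value describe a
# PKCS#7 body that wraps signed (not encrypted) content?
def pkcs7_signed_data?(content_type)
  media_type, *params = content_type.to_s.split(';').map(&:strip)
  return false unless PKCS7_MIME_TYPES.include?(media_type.to_s.downcase)

  # Look for an smime-type parameter such as smime-type=signed-data.
  pair = params.map { |param| param.split('=', 2) }
               .find { |key, _| key.to_s.strip.downcase == 'smime-type' }

  # Assumption: if the parameter is absent, attempt to unwrap anyway.
  return true if pair.nil?

  pair[1].to_s.delete('"').strip.downcase == 'signed-data'
end

puts pkcs7_signed_data?('application/pkcs7-mime; smime-type=signed-data; name="smime.p7m"')
puts pkcs7_signed_data?('application/pdf')
```

Parts that pass this check could then be unwrapped with the OpenSSL::PKCS7 approach above, and the result parsed as an ordinary MIME message.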

lukaskovarik commented 4 years ago

Hi, everyone, can we help with resolving this issue? Unfortunately I don't have Ruby dev expertise, but I have a Python team here in Prague, would that be of any help? I realise it's a long shot, but maybe there's some language-independent research to be done, for example?

garethrees commented 4 years ago

It looks like all the tools we need to parse this content type correctly are already available to us; it's more a case of integrating them into Alaveteli.

lukaskovarik commented 4 years ago

OK, understood. If there's anything we can help with in the future, please do let me know! Thanks again for having a look at this, much appreciated.