cgsecurity / testdisk

TestDisk & PhotoRec
https://www.cgsecurity.org/
GNU General Public License v2.0

Misidentified .emlx file can't find end marker #16

Open victorvde opened 7 years ago

victorvde commented 7 years ago

During a recovery run on an NTFS volume, set to recover .txt, .tx? and .vdi files, an old Claws Mail mailbox (MH format, I think) was misidentified as an .emlx file about halfway through. The mailbox was embedded in an old, possibly even deleted, .vdi (VirtualBox virtual machine disk, probably ext4). I don't care about this mailbox, but because there was no "</plist>" end marker, the recovered file grew to many gigabytes and made further recovery impossible. This happened reliably with both photorec_win.exe and qphotorec_win.exe (7.0, Sat Apr 18 13:02:01 CEST 2015).

I worked around this by decompressing photorec_win.exe with UPX and hex-editing the strings "Return-Path: " and "Received: from" so that they start with a "D" instead of an "R". It's a dirty hack, but it works.
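For anyone who wants to reproduce the workaround without a hex editor, here is a minimal sketch of the patching step in C. It only works on a buffer that already holds the UPX-decompressed executable (for example after `upx -d`); reading and writing the file is left out. `patch_first_byte` is a hypothetical helper name, not something from the PhotoRec sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite the first byte of every occurrence of
 * `needle` inside `buf` with `replacement`, so the signature string no
 * longer matches real mail headers. Returns the number of occurrences
 * that were patched. */
size_t patch_first_byte(unsigned char *buf, size_t len,
                        const char *needle, unsigned char replacement)
{
  const size_t nlen = strlen(needle);
  size_t patched = 0;
  size_t i;
  assert(buf != NULL && nlen > 0);
  for (i = 0; i + nlen <= len; i++)
  {
    if (memcmp(buf + i, needle, nlen) == 0)
    {
      buf[i] = replacement;   /* e.g. 'R' becomes 'D' */
      patched++;
      i += nlen - 1;          /* continue after this occurrence */
    }
  }
  return patched;
}
```

Calling it once with "Return-Path: " and once with "Received: from", both with 'D' as the replacement, mirrors the manual edit described above. The file length does not change, so offsets inside the executable stay valid.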

I'm just letting you know so that you can make this great piece of software even better, maybe by making it possible to disable .emlx files in some way, or by making the detection more robust.
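To illustrate what "more robust" could mean, here is a rough sketch of a stricter header check. It assumes that a real .emlx file begins with a line holding the message's byte count in decimal (optionally padded with spaces), followed directly by the mail headers; that is my understanding of the format, not something taken from file_txt.c. The function name `looks_like_emlx` and the `max_size` parameter are made up for this example and do not exist in PhotoRec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: accept the buffer as .emlx only if it starts with a
 * plausible decimal byte count on its own line, followed by one of the
 * mail header strings. On success, stores the count in *msg_size. */
int looks_like_emlx(const unsigned char *buf, size_t len,
                    unsigned long max_size, unsigned long *msg_size)
{
  static const char *const headers[] = { "Return-Path: ", "Received: from" };
  unsigned long size = 0;
  size_t i = 0;
  size_t h;
  assert(buf != NULL && msg_size != NULL);
  /* 1. Leading decimal byte count, rejected as soon as it exceeds max_size. */
  while (i < len && buf[i] >= '0' && buf[i] <= '9')
  {
    const unsigned long digit = (unsigned long)(buf[i] - '0');
    if (digit > max_size || size > (max_size - digit) / 10)
      return 0;
    size = size * 10 + digit;
    i++;
  }
  if (i == 0 || size == 0)
    return 0;               /* no count at all: a plain MH message lands here */
  /* 2. Optional padding spaces (an assumption), then the end of the line. */
  while (i < len && buf[i] == ' ')
    i++;
  if (i >= len || buf[i] != '\n')
    return 0;
  i++;
  /* 3. The mail headers must start right after the count line. */
  for (h = 0; h < sizeof(headers) / sizeof(headers[0]); h++)
  {
    const size_t hlen = strlen(headers[h]);
    if (len - i >= hlen && memcmp(buf + i, headers[h], hlen) == 0)
    {
      *msg_size = size;
      return 1;
    }
  }
  return 0;
}
```

A plain MH message that starts directly with "Return-Path: " would be rejected by step 1. If the byte count assumption holds, the returned size could also be used to cap how far the recovered file is allowed to grow when no "</plist>" marker turns up.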

Relevant piece of code: https://git.cgsecurity.org/cgit/testdisk/tree/src/file_txt.c?id=ae341302369a4a07feda0e94b4ff432217ee3916#n996