andreas-mausch / whatsapp-viewer

Small tool to display chats from the Android msgstore.db database (crypt12)
https://andreas-mausch.de/whatsapp-viewer/
MIT License
1.24k stars, 381 forks

Show filenames of images and videos #22

Closed andreas-mausch closed 4 years ago

andreas-mausch commented 8 years ago

Last time I checked (Feb 2015), WhatsApp stored serialized Java objects in the SQLite database. I'd need to write a Java deserializer in C++ to extract them.

~/Downloads> sqlite3 msgstore.db "select quote(thumb_image) from messages where _id = 253;"
X'ACED000573720016636F6D2E77686174736170702E4D6564696144617461FFF496EDE1A230060200044A000866696C6553697A654A000870726F67726573735A000B7472616E736665727265644C000466696C6574000E4C6A6176612F696F2F46696C653B7870000000000000DF900000000000000064017372000C6A6176612E696F2E46696C65042DA4450E0DE4FF0300014C0004706174687400124C6A6176612F6C616E672F537472696E673B78707400422F6D6E742F7364636172642F57686174734170702F4D656469612F576861747341707020496D616765732F494D472D32303132313031322D5741303030312E6A70677702002F78'
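The start of this dump can be read by hand. The following is a minimal sketch, assuming only the first 30 bytes of the blob above: it checks the Java serialization stream header (magic `0xACED`, version 5) and reads the length-prefixed class name that follows. The function name `describe_stream` is made up for illustration.

```python
import struct

# First 30 bytes of the hex dump above (without the X'...' wrapper).
PREFIX_HEX = "ACED000573720016636F6D2E77686174736170702E4D6564696144617461"


def describe_stream(blob: bytes) -> dict:
    """Read the stream header and the first class name of a serialized object."""
    magic, version = struct.unpack(">HH", blob[:4])
    if magic != 0xACED:
        raise ValueError("not a Java serialization stream")
    # In this blob the header is followed by TC_OBJECT (0x73) and
    # TC_CLASSDESC (0x72), then a 2-byte big-endian length and the class name.
    if blob[4:6] != b"\x73\x72":
        raise ValueError("unexpected layout after the stream header")
    (name_len,) = struct.unpack(">H", blob[6:8])
    class_name = blob[8:8 + name_len].decode("utf-8")
    return {"magic": hex(magic), "version": version, "class_name": class_name}


info = describe_stream(bytes.fromhex(PREFIX_HEX))
print(info)
```

For this prefix the class name comes out as `com.whatsapp.MediaData`, which matches the readable text inside the dump.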

seancmonahan commented 8 years ago

Would you like assistance? I'm a decent C++ programmer.

As an alternative to implementing a full Java deserializer in C++, what about using a simple regex to extract the path, just as a temporary workaround? Or are there other parts in which the SQLite db contains Java serialized objects?
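A rough sketch of what such a regex workaround could look like, run against the last bytes of the blob quoted above. The pattern is an assumption for illustration: it expects the path to start with `/` or `Media/` and to end in one of a few hand-picked extensions, none of which is confirmed by WhatsApp.

```python
import re
from typing import Optional

# Last bytes of the blob quoted above: TC_STRING (0x74), a 2-byte length
# (0x0042), the path itself, then some trailing block data.
TAIL_HEX = (
    "740042"
    "2F6D6E742F7364636172642F57686174734170702F4D656469612F"
    "576861747341707020496D616765732F"
    "494D472D32303132313031322D5741303030312E6A7067"
    "7702002F78"
)

# Assumed shape of a media path; the extension list is illustrative only.
PATH_RE = re.compile(rb"(?:/|Media/)[\x20-\x7e]+?\.(?:jpe?g|png|gif|mp4|3gp|opus)")


def find_media_path(blob: bytes) -> Optional[str]:
    """Return the first path-like byte run in the blob, or None."""
    match = PATH_RE.search(blob)
    return match.group().decode("ascii") if match else None


print(find_media_path(bytes.fromhex(TAIL_HEX)))
```

Note that the byte right before the path (`0x42`, the low length byte) and the byte right after it (`0x77`) are both printable ASCII (`B` and `w`), so a pattern that simply grabs the longest printable run would capture too much. That is one reason a regex is fragile here.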

I only just this morning used your WhatsApp-Viewer: WhatsApp makes it quite the pain in the ass to get your own messages out of it! Especially on Marshmallow, which supports "allowBackup=false" in an apk's manifest, blocking adb backups of its data. I had to install a modified boot.img with a different kernel and a couple other changes, a non-stock recovery (TWRP), and SuperSU root to get access to WhatsApp's data! (source)

I've forked your git repo, and will start looking over the source to get a grasp of the code layout.

I look forward to contributing!

-Sean

andreas-mausch commented 8 years ago

Hi Sean,

Sure, I think a lot of people will appreciate new features. As you can see, I personally haven't worked on WhatsApp Viewer for quite a while now, but I would try to merge any pull requests quickly.

Yes, rooting your phone is one way to get around the allowBackup setting. I am not sure whether this other method still works (it replaces the WhatsApp APK with an older version that doesn't set allowBackup to false): http://forum.xda-developers.com/showthread.php?t=2770982 Have you tried it?

The code is not terrible, but not clean either (way too much stuff in the Win32 folder). If you have any questions, just write me an email.

what about using a simple regex to extract the path, just as a temporary workaround?

I am not sure a regex would work consistently. I also don't know whether the structure of the Java class MediaData has changed across WhatsApp versions. Feel free to experiment.

Or are there other parts in which the SQLite db contains Java serialized objects?

Not sure.

Thanks for your assistance. Andreas

seancmonahan commented 8 years ago

Hi Andreas,

I look forward to helping out!

I am not sure whether this other method still works (it replaces the WhatsApp APK with an older version that doesn't set allowBackup to false): http://forum.xda-developers.com/showthread.php?t=2770982 Have you tried it?

I did try that first, but the version of WhatsApp included in that workaround isn't compatible with Android 6. It wouldn't install successfully, so I had to root. Thankfully I had already unlocked the bootloader so I didn't have to wipe.

I'm going to try unpacking the latest WhatsApp apk and changing the manifest to android:allowBackup="true", but I'm not sure if that will even work.

Cheers, Sean

rellieberman commented 7 years ago

It would be great if, once this feature is implemented and the file names are readable, the actual pictures and videos could be viewed as well. The user would just need to supply the media folder from their phone.
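One way this could work is to map the path stored in the database onto a media folder copied from the phone. This is only a sketch: the helper name `local_media_path` is hypothetical, and it assumes the stored path contains a `Media` directory component, as in the paths quoted in this thread.

```python
from pathlib import Path, PurePosixPath


def local_media_path(stored_path: str, media_root: Path) -> Path:
    """Map a path stored on the phone onto a local copy of the Media folder."""
    parts = PurePosixPath(stored_path).parts
    if "Media" not in parts:
        raise ValueError(f"no 'Media' component in {stored_path!r}")
    # Keep only what comes after the first 'Media' component.
    relative = parts[parts.index("Media") + 1:]
    return media_root.joinpath(*relative)


example = local_media_path(
    "/mnt/sdcard/WhatsApp/Media/WhatsApp Images/IMG-20121012-WA0001.jpg",
    Path("export") / "Media",
)
print(example)
```

Whether the file actually exists under the supplied folder would still have to be checked before trying to display it.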

Thanks for the program, it is already very useful as it is :)

andreas-mausch commented 6 years ago

A look at a current database schema shows there is now a messages.media_name column.

andreas-mausch commented 6 years ago

media_name is only set for some files (sent ones, but not received ones?),

so Java deserialization still needs to be done. https://github.com/tcalmant/python-javaobj/blob/master/javaobj.py

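import javaobj  # python-javaobj, linked above

# 'blob' is a file containing the raw serialized bytes
with open('blob', 'rb') as file:
    pobj = javaobj.loads(file.read())
    print(pobj.file.path)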

prints

Media/WhatsApp Images/Sent/IMG-20171013-WA0001.jpg
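The snippet above reads from a file named `blob`. As a sketch of how the same bytes could be pulled straight from the database, the following uses Python's built-in `sqlite3` module with the `messages` table and the `_id` and `thumb_image` columns from the query at the top of this issue. The in-memory table and its two rows are stand-ins for a real msgstore.db, and `load_media_blob` is a made-up helper name.

```python
import sqlite3
from typing import Optional

# Java serialization streams start with magic 0xACED and version 0x0005.
JAVA_STREAM_HEADER = b"\xac\xed\x00\x05"


def load_media_blob(conn: sqlite3.Connection, message_id: int) -> Optional[bytes]:
    """Return thumb_image for a message if it looks like a Java stream."""
    row = conn.execute(
        "SELECT thumb_image FROM messages WHERE _id = ?", (message_id,)
    ).fetchone()
    if row is None or row[0] is None:
        return None
    blob = bytes(row[0])
    return blob if blob.startswith(JAVA_STREAM_HEADER) else None


# Stand-in for a decrypted msgstore.db, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (_id INTEGER PRIMARY KEY, thumb_image BLOB)")
conn.execute("INSERT INTO messages VALUES (?, ?)", (253, JAVA_STREAM_HEADER + b"sr"))
conn.execute("INSERT INTO messages VALUES (?, ?)", (254, None))

print(load_media_blob(conn, 253))
```

The returned bytes are raw binary, so they could be handed to `javaobj.loads` in place of `file.read()`.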

andreas-mausch commented 6 years ago

https://www.javaworld.com/article/2072752/the-java-serialization-algorithm-revealed.html

andreas-mausch commented 6 years ago

Ok I tried to implement this by matching a binary pattern. It is very hacky. Let's see how long it is compatible with the latest WhatsApp version.
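To illustrate what matching a binary pattern can mean here, the sketch below searches the blob from the first comment for the type string of the `path` field, followed by the bytes `78 70 74` (TC_ENDBLOCKDATA, TC_NULL, TC_STRING), and then reads a length-prefixed string. The marker is an assumption derived from that one blob; it is not necessarily the pattern WhatsApp Viewer matches.

```python
import struct
from typing import Optional

# Blob from the sqlite3 output in the first comment, split by structure.
HEX_DUMP = (
    "ACED0005"                                              # magic, version
    "73720016636F6D2E77686174736170702E4D6564696144617461"  # com.whatsapp.MediaData
    "FFF496EDE1A23006020004"                                # serialVersionUID, flags, 4 fields
    "4A000866696C6553697A65"                                # long fileSize
    "4A000870726F6772657373"                                # long progress
    "5A000B7472616E73666572726564"                          # boolean transferred
    "4C000466696C6574000E4C6A6176612F696F2F46696C653B7870"  # java.io.File file
    "000000000000DF90000000000000006401"                    # field values
    "7372000C6A6176612E696F2E46696C65042DA4450E0DE4FF030001"  # java.io.File
    "4C0004706174687400124C6A6176612F6C616E672F537472696E673B7870"  # String path
    "740042"                                                # TC_STRING, length 66
    "2F6D6E742F7364636172642F57686174734170702F4D656469612F"
    "576861747341707020496D616765732F"
    "494D472D32303132313031322D5741303030312E6A7067"
    "7702002F78"                                            # trailing block data
)

# Assumed marker: field type string, TC_ENDBLOCKDATA, TC_NULL, TC_STRING.
MARKER = b"Ljava/lang/String;" + b"\x78\x70\x74"


def extract_path(blob: bytes) -> Optional[str]:
    """Find the marker and read the length-prefixed string that follows it."""
    start = blob.find(MARKER)
    if start < 0:
        return None
    pos = start + len(MARKER)
    if pos + 2 > len(blob):
        return None
    (length,) = struct.unpack(">H", blob[pos:pos + 2])
    raw = blob[pos + 2:pos + 2 + length]
    if len(raw) != length:
        return None  # truncated blob
    # Java writes modified UTF-8; plain UTF-8 is close enough for ASCII paths.
    return raw.decode("utf-8", errors="replace")


print(extract_path(bytes.fromhex(HEX_DUMP)))
```

Reading the explicit length avoids guessing where the path ends, but the sketch still breaks as soon as the class layout changes.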

andreas-mausch commented 6 years ago

migueltg commented 4 years ago

@andreas-mausch Do you know how I can extract the audio/video/image file path in Python? I don't quite understand your process. Is this it? Thank you very much for your help.

andreas-mausch commented 4 years ago

It is.

Ok I tried to implement this by matching a binary pattern. It is very hacky. Let's see how long it is compatible with the latest WhatsApp version.

migueltg commented 4 years ago

It is.

Ok I tried to implement this by matching a binary pattern. It is very hacky. Let's see how long it is compatible with the latest WhatsApp version.

I am trying that, but I always get the same error: The stream is not java serialized object. Invalid stream header: 41434544. Is the field thumb_image?
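One plausible reading of that error: `41 43 45 44` are the ASCII codes of the characters `ACED`, so the data handed to the deserializer may have been the hex text of the blob (for example the output of `quote(thumb_image)`) rather than the raw bytes. The thread does not say what the actual cause was. A minimal sketch of converting such text back to bytes, with the hypothetical helper `normalize_blob`:

```python
JAVA_STREAM_HEADER = b"\xac\xed\x00\x05"


def normalize_blob(raw: bytes) -> bytes:
    """Return raw stream bytes, decoding hex text such as X'ACED...' if needed."""
    if raw.startswith(JAVA_STREAM_HEADER):
        return raw  # already binary
    text = raw.decode("ascii", errors="ignore").strip()
    # sqlite3's quote() wraps blobs as X'...'; strip that wrapper if present.
    if text[:2].upper() == "X'" and text.endswith("'"):
        text = text[2:-1]
    decoded = bytes.fromhex(text)  # raises ValueError if this is not hex
    if not decoded.startswith(JAVA_STREAM_HEADER):
        raise ValueError("decoded data is not a Java serialization stream")
    return decoded


# The header from the error message is just the text "ACED" in ASCII.
print(b"ACED".hex())
print(normalize_blob(b"X'ACED0005'"))
```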

migueltg commented 4 years ago

It is.

Ok I tried to implement this by matching a binary pattern. It is very hacky. Let's see how long it is compatible with the latest WhatsApp version.

I am trying that, but I always get the same error: The stream is not java serialized object. Invalid stream header: 41434544. Is the field thumb_image?

Working!! Thanks a lot @andreas-mausch 👏 👏