sepinf-inc / IPED

IPED Digital Forensic Tool. It is open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Merge Whatsapp msgstore.db backups #150

Closed LeonardoDiasRR closed 2 years ago

LeonardoDiasRR commented 4 years ago

Hi,

The WhatsApp parser handles the main msgstore.db and its copies separately, generating many chats for each "real" chat. This causes a lot of misunderstanding among analysts.

It would be very good if the main msgstore.db and its copies were merged into a single msgstore. It would also reduce the number of items related to WhatsApp chats.

fmpfeifer commented 4 years ago

I don't think it is possible to merge those files. Normally, when there are many msgstore.db files, they are copies taken at different times (backups). The user might have deleted some messages from a chat, for example, and those messages could still be found in an older version of the file. Also, if we simply merge the databases and bring those messages back, we lose valuable information (that the user deleted those messages), just to give one example. Maybe we can use the "Filter Listed Duplicates" option to get rid of the identical chats.

lfcnassif commented 4 years ago

Maybe the feature to extract individual WhatsApp messages, coming in the next release, could be used as another approach for this? Users could analyze messages in the table view, so they could see all of them. But duplicate messages will appear, and the duplicate filter will not work because individual messages have no hash. Maybe a specific filter to remove messages with the same body and date could help?
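A minimal sketch of such a filter, assuming each extracted message exposes a chat identifier, a timestamp and a body. The `MessageDedup` and `Msg` names are hypothetical and are not IPED classes; including the chat identifier in the key is an extra assumption, made so that identical texts sent at the same instant in different chats are not collapsed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: these types are illustrative and not part of IPED.
final class MessageDedup {

    /** Minimal stand-in for an individually extracted message. */
    static final class Msg {
        final String chatId;
        final long timestampMillis;
        final String body;

        Msg(String chatId, long timestampMillis, String body) {
            this.chatId = chatId;
            this.timestampMillis = timestampMillis;
            this.body = body;
        }
    }

    /** Builds the comparison key: chat + date + body (assumed to identify a message). */
    private static String key(Msg m) {
        return m.chatId + '\u0000' + m.timestampMillis + '\u0000' + m.body;
    }

    /** Returns the messages in their original order, keeping only the first copy of each key. */
    static List<Msg> removeDuplicates(List<Msg> messages) {
        Set<String> seen = new HashSet<>();
        List<Msg> unique = new ArrayList<>();
        for (Msg m : messages) {
            // Set.add returns false when the key was already present.
            if (seen.add(key(m))) {
                unique.add(m);
            }
        }
        return unique;
    }
}
```

A key built this way would also drop genuine repeats (the same text sent twice within the same timestamp resolution), so a real filter might prefer a stable message identifier when one is available.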

lfcnassif commented 3 years ago

I did some updates to the initial implementation of @hauck-jvsh in the merge_msgstore branch and merged it into master. The feature is disabled by default for now, until all TODOs in WhatsAppParser are addressed.

LeonardoDiasRR commented 3 years ago

Pfeifer,

I believe it's possible to merge the content of all msgstore.db databases into a single chat in IPED, like Cellebrite does in UFED PA. As Nassif said, with the Instant Messages (IM) extraction feature, it's possible to compare the content and timestamp of each IM (or its hash) and rebuild the chat's timeline, merging the contents of all msgstore.db files. In this case, it's necessary to ignore the duplicates and to mark or flag (red color?) the IMs that were deleted.

It will be VERY useful for the analysts working on the case.
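The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ChatMerge`, `Msg` and the `id` field are hypothetical names, not IPED code, and the sketch assumes every message carries a key that is stable across the main database and its backups. A message found only in a backup is flagged as possibly deleted rather than silently merged.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these types are illustrative and not part of IPED.
final class ChatMerge {

    static final class Msg {
        final String id;               // assumed stable across main DB and backups
        final long timestampMillis;
        final String body;
        final boolean missingFromMain; // true = only seen in a backup (possibly deleted)

        Msg(String id, long timestampMillis, String body, boolean missingFromMain) {
            this.id = id;
            this.timestampMillis = timestampMillis;
            this.body = body;
            this.missingFromMain = missingFromMain;
        }
    }

    /** Merges one chat from the main DB with the same chat read from each backup. */
    static List<Msg> merge(List<Msg> main, List<List<Msg>> backups) {
        Map<String, Msg> byId = new LinkedHashMap<>();
        for (Msg m : main) {
            byId.put(m.id, m);
        }
        for (List<Msg> backup : backups) {
            for (Msg m : backup) {
                // Duplicates are ignored; messages absent from the main DB are kept and flagged.
                if (!byId.containsKey(m.id)) {
                    byId.put(m.id, new Msg(m.id, m.timestampMillis, m.body, true));
                }
            }
        }
        List<Msg> merged = new ArrayList<>(byId.values());
        // Rebuild the timeline by ordering on the message timestamp.
        merged.sort(Comparator.comparingLong((Msg m) -> m.timestampMillis));
        return merged;
    }
}
```

Absence from the main database does not by itself prove that the user deleted a message, so the flag is best read as a hint for the analyst to review.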

fmpfeifer commented 3 years ago

Tagging the deleted messages is a very good idea. I think that this way we won't lose any information. For sure it will be a huge improvement.

lfcnassif commented 3 years ago

The deleted message warning was already implemented by @hauck-jvsh. @LeonardoDiasRR, there are still some TODOs to address in the code; if you can help, contributions are welcome.

lfcnassif commented 2 years ago

@LeonardoDiasRR, see the work being done on #923. If you could help with testing, that would be great.

hauck-jvsh commented 2 years ago

Now it allows more than one mainDB in each evidence, and it also identifies backups in other evidences. This may occur when the memory is extracted into another evidence and the backups are in it. @lfcnassif, just one thing:

EmbeddedDocumentExtractor extractor = context.get(EmbeddedDocumentExtractor.class,
                    new ParsingEmbeddedDocumentExtractor(context));

if this code is executed the preview for the item is not being rendered, do you know why?

lfcnassif commented 2 years ago

> if this code is executed the preview for the item is not being rendered, do you know why?

No, it shouldn't have anything related to rendering previews...

Let me know when the PR is ready for review, and I will take a look.

@LeonardoDiasRR, as you are interested in this feature, let me know whether you can help with testing.

hauck-jvsh commented 2 years ago

I think it is ready for review.

lfcnassif commented 2 years ago

Thanks @hauck-jvsh, I'll test this after giving @LeonardoDiasRR some time to help us with testing.

lfcnassif commented 2 years ago

Are there ChatStorage.sqlite backups in iOS devices? If they could be identified by a simple name pattern, this feature could be applied to them too.

lfcnassif commented 2 years ago

> Hello, I tried to get the msgstore merge to work and I was not successful. I have the main msgstore.db and a backup that I decrypted; I put everything in the same folder and processed it with IPED. The result was that duplicate chats were generated and I didn't get the expected result from the merge. Where am I wrong? Is there a specific folder to place the decrypted databases?

The backup DB file should have the following name pattern (used by default by Android WhatsApp) to pass the first detection phase: msgstore-2022-07-04.db. What is the name of your backup DB?
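As a rough illustration of that first, name-based check, here is a small sketch. The regular expression is an assumption derived only from the example name above (`msgstore-YYYY-MM-DD.db`); it is not the actual expression used by IPED, and `BackupNames` is a hypothetical helper.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: illustrative only, not the detection code used by IPED.
final class BackupNames {

    // Assumed pattern, inferred from the example name msgstore-2022-07-04.db.
    private static final Pattern BACKUP_NAME =
            Pattern.compile("msgstore-\\d{4}-\\d{2}-\\d{2}\\.db");

    /** True for the main database name. */
    static boolean isMainName(String fileName) {
        return "msgstore.db".equals(fileName);
    }

    /** True when the whole file name matches the dated backup pattern. */
    static boolean looksLikeBackup(String fileName) {
        return fileName != null && BACKUP_NAME.matcher(fileName).matches();
    }
}
```

Under this assumption, a decrypted backup saved under a custom name such as `msgstore_backup.db` would not pass the check and would need to be renamed to the dated form.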