sepinf-inc / IPED

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

WhatsApp quote messages #1916

Closed gfd2020 closed 10 months ago

gfd2020 commented 10 months ago

WhatsApp allows you to quote messages. It would be useful if IPED could detect these quoted messages to improve the context of conversations.

I'm already working on a PR to address this; see the example below. The behavior is the same as in WhatsApp: when you click a quote, the conversation scrolls to the quoted message and highlights it with a color that fades out.

https://github.com/sepinf-inc/IPED/assets/59742865/9f9e5d4f-d231-4014-846b-db22c57a5a1a

lfcnassif commented 10 months ago

Thank you @gfd2020, this would be very useful!

However, there is an existing open ticket for this (#693), and I think @thalespr was working on it, though I'm not sure about his progress. Can I close this as a duplicate, and could you work together with him?

gfd2020 commented 10 months ago

Oh! I didn't know there was already a ticket for that, sorry. You can close this one, then. My implementation is already well advanced: text is working, and only the media cases remain to be dealt with. I can work with him.

lfcnassif commented 10 months ago

Ok, thank you, let's continue the conversation on #693.

wladimirleite commented 7 months ago

Processing a large WhatsApp Android DB (provided by @aberenguel) takes ~4 hours to extract quoted messages, while the "regular" messages are extracted in ~50 seconds. The database file is ~1.8 GB and contains ~1.12M messages, 61K chats and 168K quoted messages.

This is similar to the situation described in #1889. With a huge number of chats (61K), the query to get quoted messages is executed 61K times. Each execution takes between 200 and 500 ms, which is relatively fast on its own, but multiplied by 61K it adds up to several hours (61K × 200 ms alone is about 3.4 hours). This is an extreme case, but having a few hundred chats is somewhat common.

I applied the same technique used in #1889: querying just once for all chats instead of once per chat. This reduced the time to extract quoted messages to only 10 seconds. I am including this in PR #2048.
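For readers unfamiliar with the pattern, here is a minimal Python sketch of the difference between querying once per chat and querying once for everything, then grouping in memory. It is not IPED's code (IPED is written in Java) and it does not use the real WhatsApp schema: the `message` and `quoted` tables, their columns, and all function names are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3
from collections import defaultdict


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE message (id INTEGER PRIMARY KEY, chat_id INTEGER, body TEXT);
        CREATE TABLE quoted (message_id INTEGER, quoted_body TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO message VALUES (?, ?, ?)",
        [
            (1, 10, "hello"),
            (2, 10, "re: hello"),
            (3, 20, "ping"),
            (4, 20, "re: ping"),
            (5, 30, "no quotes here"),
        ],
    )
    conn.executemany("INSERT INTO quoted VALUES (?, ?)", [(2, "hello"), (4, "ping")])
    return conn


# Shared join (hypothetical): each quoting message together with its chat.
QUERY = (
    "SELECT m.chat_id, m.id, q.quoted_body "
    "FROM quoted q JOIN message m ON m.id = q.message_id"
)


def quotes_per_chat(conn, chat_ids):
    """Slow pattern: one query execution per chat (N chats, N queries)."""
    result = {}
    for chat_id in chat_ids:
        rows = conn.execute(
            QUERY + " WHERE m.chat_id = ? ORDER BY m.id", (chat_id,)
        ).fetchall()
        result[chat_id] = [(mid, body) for _, mid, body in rows]
    return result


def quotes_single_query(conn, chat_ids):
    """Fast pattern: one query for all chats, grouped by chat_id in memory."""
    grouped = defaultdict(list)
    for chat_id, mid, body in conn.execute(QUERY + " ORDER BY m.id"):
        grouped[chat_id].append((mid, body))
    return {chat_id: grouped.get(chat_id, []) for chat_id in chat_ids}


if __name__ == "__main__":
    conn = build_demo_db()
    chats = [10, 20, 30]
    # Both approaches return the same mapping; only the query count differs.
    print(quotes_per_chat(conn, chats) == quotes_single_query(conn, chats))
```

Both functions return the same mapping from chat to quoted messages. The second one pays the fixed per-query cost once instead of once per chat, which is what matters when there are tens of thousands of chats.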

lfcnassif commented 7 months ago

Thank you very much @wladimirleite for investigating, measuring the times and fixing this performance regression!