sepinf-inc / IPED

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Support some recent Telegram iOS chats not being parsed #1998

Closed: wladimirleite closed this issue 5 months ago

wladimirleite commented 7 months ago

While working on a real case with many iOS Telegram databases (db_sqlite), I noticed that IPED extracted no chats. The log showed many exceptions, most of them ArrayIndexOutOfBoundsException (AIOOB). For some of the chats I have the output generated by Cellebrite PA, but PA missed others because they were stored in different paths (by Nicegram, not Telegram), so it would be great if IPED could parse them.

I collected iOS Telegram database samples from other cases, and most of the recent databases could not be parsed either.

While debugging the issue I found that, at least since Oct/2021 (when the related Telegram code was changed), if certain flags are set (e.g. forwarded message), the parser fails because it doesn't skip the extra information that is present when these flags are set.
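To illustrate the failure mode described above, here is a minimal sketch of a flag-guarded binary reader. It is not the IPED or Telegram code: the class name, flag values, and record layout are all hypothetical and were chosen only to show why a reader that ignores flag-dependent fields ends up misaligned.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class FlaggedRecordSketch {
    // Hypothetical flag bits, for illustration only (not Telegram's real values).
    static final int FLAG_FORWARDED = 1 << 0;
    static final int FLAG_HAS_TAG = 1 << 1;

    // Hypothetical layout:
    // [int32 flags]
    // [if FORWARDED: int64 authorId, int32 date]
    // [if HAS_TAG: int32 tag]
    // [int32 textLength][text bytes, UTF-8]
    static String readText(ByteBuffer buf, boolean skipOptionalFields) {
        int flags = buf.getInt();
        if (skipOptionalFields) {
            if ((flags & FLAG_FORWARDED) != 0) {
                // Skip authorId (8 bytes) and date (4 bytes).
                buf.position(buf.position() + 8 + 4);
            }
            if ((flags & FLAG_HAS_TAG) != 0) {
                // Skip tag (4 bytes).
                buf.position(buf.position() + 4);
            }
        }
        // If the optional fields were not skipped, this reads the wrong bytes
        // as the length and the following read goes out of bounds.
        int length = buf.getInt();
        byte[] text = new byte[length];
        buf.get(text);
        return new String(text, StandardCharsets.UTF_8);
    }

    // Builds a sample record with only the FORWARDED flag set.
    static ByteBuffer sampleForwardedRecord(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 4 + 4 + bytes.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(FLAG_FORWARDED);
        buf.putLong(1000L);      // hypothetical forward author id
        buf.putInt(1633046400);  // hypothetical forward date
        buf.putInt(bytes.length);
        buf.put(bytes);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        // Flag-aware reader stays aligned and recovers the text.
        System.out.println(readText(sampleForwardedRecord("hello"), true));
        try {
            // Naive reader treats part of authorId as the text length.
            readText(sampleForwardedRecord("hello"), false);
        } catch (RuntimeException e) {
            System.out.println("naive reader failed: " + e);
        }
    }
}
```

In this sketch the naive path fails with a buffer underflow rather than an AIOOB, but the cause is the same kind of misalignment: once one flag-dependent field is not skipped, every later read starts at the wrong offset.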

I think this issue is not related to #1976 because when there is an exception reading a message, the parsing is aborted and the chat subitem is not created.

Anyway, I reviewed the methods readMessage and readForwardInfo in the PostBoxCoding class, and now all the samples I have (including old ones) are being parsed.

lfcnassif commented 7 months ago

Thank you very much @wladimirleite for reporting this relevant situation, debugging it and sending a PR!

Just a minor question: do you think this is really a bug fix or support for new Telegram versions?

wladimirleite commented 7 months ago

Good question! Initially I thought it was a bug in handling specific DB content, but since it happens with many other recent DBs, I guess it is more of an enhancement to handle newer Telegram versions. There are a couple of minor details where I think our parser differed slightly from the reference implementation, but these are very minor (possibly harmless) differences.

wladimirleite commented 7 months ago

Changed to enhancement and edited the issue title to make it clearer.

By the way, the decoding code is quite complex, so I made some housekeeping changes (like renaming some variables) to make comparison with the reference implementation a bit easier, which may be helpful in the future.