sepinf-inc / IPED

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Error parsing WhatsApp DB (Comparison method violates its general contract) #2337

Closed wladimirleite closed 3 weeks ago

wladimirleite commented 1 month ago

Another user reported the following error, which occurs in both 4.1.x and master. After analysing the database, I found that the issue is caused by the way messages are sorted: sorting may fail when a very specific combination of records (with zeros or null values in the columns used for sorting) is present, as is the case in the triggering database. I will submit a fix shortly.

2024-10-15 09:32:48 [ERROR] [parsers.whatsapp.WhatsAppParser]           Error parsing WhatsApp: Item: XXX/msgstore.db type: sqlite size: 61648896
org.apache.tika.exception.TikaException: WAExtractorException Exception
    at iped.parsers.whatsapp.WhatsAppParser.parseWhatsappMessages(WhatsAppParser.java:399) ~[iped-parsers-impl-4.2-snapshot.jar:?]
    at iped.parsers.whatsapp.WhatsAppParser.parse(WhatsAppParser.java:259) [iped-parsers-impl-4.2-snapshot.jar:?]
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) [tika-core-2.4.0-p1.jar:2.4.0]
    at iped.parsers.standard.StandardParser.parse(StandardParser.java:245) [iped-parsers-impl-4.2-snapshot.jar:?]
    at iped.engine.io.ParsingReader$BackgroundParsing.run(ParsingReader.java:247) [iped-engine-4.2-snapshot.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
    at java.util.ComparableTimSort.mergeHi(Unknown Source) ~[?:?]
    at java.util.ComparableTimSort.mergeAt(Unknown Source) ~[?:?]
    at java.util.ComparableTimSort.mergeForceCollapse(Unknown Source) ~[?:?]
    at java.util.ComparableTimSort.sort(Unknown Source) ~[?:?]
    at java.util.Arrays.sort(Unknown Source) ~[?:?]
    at java.util.Arrays.sort(Unknown Source) ~[?:?]
    at java.util.ArrayList.sort(Unknown Source) ~[?:?]
    at java.util.Collections.sort(Unknown Source) ~[?:?]
    at iped.parsers.whatsapp.ExtractorAndroidNew.extractChatList(ExtractorAndroidNew.java:150) ~[iped-parsers-impl-4.2-snapshot.jar:?]
    at iped.parsers.whatsapp.Extractor.getChatList(Extractor.java:34) ~[iped-parsers-impl-4.2-snapshot.jar:?]
    at iped.parsers.whatsapp.WhatsAppParser.parseWhatsappMessages(WhatsAppParser.java:391) ~[iped-parsers-impl-4.2-snapshot.jar:?]
    ... 9 more
wladimirleite commented 1 month ago

For the curious: the problem arises when we try to sort items whose comparison is not transitive, i.e. some of them don't have a "linear" order (A > B, B > C and C > A). The comparator used by the WhatsAppParser is something like:

public int compareTo(Message o) {
    if (a != 0 && o.a != 0) {
        int cmp = Integer.compare(a, o.a);
        if (cmp != 0) return cmp;
    }
    if (b != 0 && o.b != 0) {
        int cmp = Integer.compare(b, o.b);
        if (cmp != 0) return cmp;
    }
    return Integer.compare(c, o.c);
}

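If we have items like X = {a=0, b=2, c=1}, Y = {a=2, b=1, c=0}, Z = {a=1, b=0, c=2}, then X > Y (decided by b, because X.a is zero), Y > Z (decided by a) and Z > X (decided by c, because a and b each have a zero on one side). This cycle can cause an exception when we try to sort them. In fact, the list must have at least 32 items: shorter lists are sorted with a binary insertion sort, and the contract violation is only detected in the merge functions.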
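The cycle can be checked directly with a small standalone program. This is only a sketch: `Item` and `formsCycle` are hypothetical names used for illustration (they are not IPED classes), and the `compareTo` body is copied from the simplified comparator above rather than from the real `Message` class.

```java
public class NonTransitiveDemo {
    // Hypothetical stand-in for Message; fields a, b, c mirror the simplified snippet.
    static final class Item implements Comparable<Item> {
        final int a, b, c;

        Item(int a, int b, int c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        @Override
        public int compareTo(Item o) {
            // A column is only used when it is non-zero on BOTH sides,
            // so different pairs may be decided by different columns.
            if (a != 0 && o.a != 0) {
                int cmp = Integer.compare(a, o.a);
                if (cmp != 0) return cmp;
            }
            if (b != 0 && o.b != 0) {
                int cmp = Integer.compare(b, o.b);
                if (cmp != 0) return cmp;
            }
            return Integer.compare(c, o.c);
        }
    }

    // True when x > y, y > z and z > x all hold, i.e. the three items form a cycle.
    static boolean formsCycle(Item x, Item y, Item z) {
        return x.compareTo(y) > 0 && y.compareTo(z) > 0 && z.compareTo(x) > 0;
    }

    public static void main(String[] args) {
        Item x = new Item(0, 2, 1);
        Item y = new Item(2, 1, 0);
        Item z = new Item(1, 0, 2);
        System.out.println("X > Y: " + (x.compareTo(y) > 0));
        System.out.println("Y > Z: " + (y.compareTo(z) > 0));
        System.out.println("Z > X: " + (z.compareTo(x) > 0));
        System.out.println("cycle: " + formsCycle(x, y, z));
    }
}
```

All three comparisons come out positive, so no ordering of X, Y and Z is consistent with the comparator. Whether a sort actually throws depends on the list size and on where such items end up during merging.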

lfcnassif commented 1 month ago

Closed by #2352.

wladimirleite commented 3 weeks ago

Sorry @lfcnassif, but I will reopen this once again, as there is still an issue when backups are merged. The merging process sorts messages, but it cannot use Message.sort(), because later there are binary searches that rely on the "regular" sorting (Collections.sort(), which uses only the comparison implemented by the Message class). The solution I found is to keep the merging code as it is and call Message.sort() after the merging process. I will submit a PR with this additional fix.
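The constraint about binary searches can be illustrated generically. The sketch below is not IPED code: `BinarySearchOrderDemo` and `sortThenSearch` are hypothetical names, and plain integers stand in for messages. It only shows that `Collections.binarySearch` with the natural ordering is reliable when the list was sorted with that same ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class BinarySearchOrderDemo {
    // Sorts a copy of the list with `order`, then searches it using the natural ordering.
    static int sortThenSearch(List<Integer> values, Comparator<Integer> order, int key) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(order);
        // binarySearch assumes the list is sorted by the natural ordering;
        // its Javadoc leaves the result undefined otherwise.
        return Collections.binarySearch(copy, key);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 1, 5, 2, 4);
        // Sort order and search order agree: the key is found.
        int matching = sortThenSearch(values, Comparator.naturalOrder(), 5);
        // Sort order and search order differ: the result cannot be trusted.
        int mismatched = sortThenSearch(values, Comparator.reverseOrder(), 5);
        System.out.println("same ordering: " + matching);
        System.out.println("different ordering: " + mismatched);
    }
}
```

This is why, in the approach described above, the merging code keeps the ordering that the binary searches expect and the other sort is applied only after merging has finished.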

lfcnassif commented 3 weeks ago

Don't worry, and thank you @wladimirleite for continuously checking the changes. I thought this could be very tricky when merging DBs with different sorting criteria, but I didn't test the changes. I'm sorry about that.