abrignoni / ALEAPP

Android Logs Events And Protobuf Parser
MIT License

changed serialization to json for datalist field in tl.db #497

Closed rick-slin closed 2 weeks ago

rick-slin commented 1 month ago

I'd like to change the serialization method of the datalist column of the tl.db timeline database to use JSON. This will allow other programs/scripts to use the data in that column.

See issue #496

Thank you for your consideration
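
As a rough illustration of the proposal, the sketch below stores one JSON object per timeline row so that any JSON parser can read the column back. The table layout (`data` with `key`, `activity`, `datalist`) and the helper name `write_timeline_rows` are assumptions made for this example; they are not taken from the ALEAPP source.

```python
import json
import sqlite3


def write_timeline_rows(conn, activity, headers, rows):
    """Store one JSON object per timeline row (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data (key TEXT, activity TEXT, datalist TEXT)"
    )
    for row in rows:
        # Serialize only this row, pairing each header with its value.
        payload = json.dumps(dict(zip(headers, row)), default=str)
        conn.execute(
            "INSERT INTO data VALUES (?, ?, ?)",
            (str(row[0]), activity, payload),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
write_timeline_rows(
    conn,
    "Chrome History",
    ["Timestamp", "URL"],
    [("2024-01-01 10:00:00", "https://example.com")],
)
stored = conn.execute("SELECT datalist FROM data").fetchone()[0]
print(json.loads(stored)["URL"])
```

A consumer only needs `json.loads` (or the equivalent in another language) to recover the row, which is the interoperability benefit described above.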

stark4n6 commented 1 month ago

@rick-slin just ran a test and the results were interesting. The timeline DB alone was 27 GB. It looks like you're writing the full JSON output of a parser into every row of the DB for that artifact type, in essence making 400+ copies of the same data (the example here was Chrome History). [image]

I don't think that was the desired result.
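
One possible way to spot this kind of duplication is to group rows by activity and payload and look for payloads stored more than once. This is only a sketch: the `data` table and its `activity` and `datalist` columns are assumed names, and a few legitimate duplicates can exist in real data, so large counts are the signal to look for.

```python
import sqlite3


def duplicated_payloads(conn):
    """Return (activity, copies) for datalist values stored more than once."""
    return conn.execute(
        "SELECT activity, COUNT(*) AS copies FROM data "
        "GROUP BY activity, datalist HAVING COUNT(*) > 1 "
        "ORDER BY copies DESC"
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical schema, used only for this demonstration.
conn.execute("CREATE TABLE data (key TEXT, activity TEXT, datalist TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("1", "CHROME HISTORY", '[["a"], ["b"]]'),  # whole list repeated
        ("2", "CHROME HISTORY", '[["a"], ["b"]]'),
        ("3", "WIFI", '{"ssid": "x"}'),
    ],
)
print(duplicated_payloads(conn))
```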

rick-slin commented 1 month ago

Oops. Definitely not. I'll try again.

stark4n6 commented 1 month ago

@rick-slin I'm seeing a bunch of type errors too for individual parsers; it looks like it doesn't like datetime items. [image]
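
For context, Python's `json.dumps` raises `TypeError` on `datetime` values unless it is given a `default` callable. The sketch below shows one common way around that, converting datetimes to ISO 8601 strings; the helper name `to_jsonable` and the sample row are illustrative and not taken from the pull request.

```python
import json
from datetime import datetime


def to_jsonable(value):
    """Fallback for json.dumps: convert datetimes to ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


row = {"Timestamp": datetime(2024, 1, 1, 10, 0, 0), "URL": "https://example.com"}

try:
    json.dumps(row)  # fails: datetime is not JSON serializable by default
except TypeError as err:
    print("without default:", err)

encoded = json.dumps(row, default=to_jsonable)
print(encoded)
```

Passing `default=str` is a shorter alternative, at the cost of silently stringifying every unsupported type rather than only datetimes.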

rick-slin commented 1 month ago

Sorry, still not working, did not mean to upload.

Johann-PLW commented 1 month ago

Don't worry. Let us know when you think everything works as expected so we can test it on our side.

rick-slin commented 1 month ago

It now works as I expect it to. Let me know what you think.

Johann-PLW commented 1 month ago

@rick-slin,

With your updates, the result is now much better. I tested your new code with the Android 12 public image provided by Josh Hickman (https://downloads.digitalcorpora.org/corpora/mobile/android_12.zip). With the 'Google_G013A Pixel 3.zip' file and all modules selected except SQLite Journaling, the tl.db contains 220 414 records, but with the new serialization method 2 records are missing (78 289 & 78 290).

[image]

If you could have a look. Thank you so much.

rick-slin commented 1 month ago

I'll check it out

rick-slin commented 2 weeks ago

@Johann-PLW I'm not able to reproduce this issue. I've run the main branch and this one against the Android 12 public image you mentioned above with no additional arguments (python3 -m aleapp -t tar -o ~/aleapp_out/android12-json/ -i ~/samples/Android\ 12/TAR\ File/Android\ 12\ -\ Data.tar). I get 155,920 entries in the tl.db in both cases. Looking specifically at activities that start with "FCM-Dump-", I get 16,815 records in both cases. Could you please provide me with the profile you used?
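
A comparison like this can be scripted. The sketch below counts all rows and the rows whose activity starts with a given prefix; running it against the tl.db from each branch and comparing the two tuples would reproduce the check. The table name `data` and the column name `activity` are assumptions about the timeline schema, and the demo uses an in-memory database rather than a real tl.db.

```python
import sqlite3


def timeline_counts(conn, prefix):
    """Return (total rows, rows whose activity starts with prefix)."""
    total = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
    # LIKE is case-insensitive for ASCII in SQLite by default, and any
    # '%' or '_' inside prefix would act as a wildcard.
    matching = conn.execute(
        "SELECT COUNT(*) FROM data WHERE activity LIKE ?", (prefix + "%",)
    ).fetchone()[0]
    return total, matching


conn = sqlite3.connect(":memory:")
# Hypothetical schema, used only for this demonstration.
conn.execute("CREATE TABLE data (key TEXT, activity TEXT, datalist TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("1", "FCM-Dump-com.example.a", "{}"),
        ("2", "FCM-Dump-com.example.b", "{}"),
        ("3", "Chrome History", "{}"),
    ],
)
print(timeline_counts(conn, "FCM-Dump-"))
```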

Johann-PLW commented 2 weeks ago

@rick-slin I have tested three times on two different computers, without changing anything, and I was not able to reproduce the issue either. No record was missing. I dug a little deeper and found that the problem occurred during the parsing itself, as both records were also missing from my HTML report and CSV file. I'm really sorry to have wasted your time with this. Thank you very much for this great contribution.

stark4n6 commented 2 weeks ago

@Johann-PLW think we should mirror this to the other projects?

Johann-PLW commented 2 weeks ago

Yep, I would be able to do that this weekend.

stark4n6 commented 2 weeks ago

> Yep, I would be able to do that this weekend.

I should have some time this afternoon to take a look; it seems simple enough to implement.