Closed by wladimirleite 2 months ago
It finally finished (took more than 16 hours). I will test now with the changes that I made.
Connected to database d:\iped-hashes.db
Database tables and indexes created.
Last HASH_ID = 0
Last PROPERTY_ID = 0
Properties loaded = 0
Reading NSRL_DB file g:\RDS_2024.03.1_android.db...
97148886 records read in 59027 seconds.
29406404 hashes inserted.
36897886 hashes updated.
20988875 hashes were already in the database.
5614548 zero length hashes were ignored.
4241173 records combined.
Commiting changes...
Commit completed in 6 seconds.
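As a quick sanity check on the log above, the five outcome counters can be added up and compared with the number of records read, and the overall throughput can be derived from the elapsed time. This is only a sketch: the numbers are copied from the log, while the variable names and the assumption that the five categories are mutually exclusive are mine, not something the importer documents.

```python
# Counters copied from the import log above.
records_read = 97_148_886
elapsed_seconds = 59_027

# Assumption: each record ends up in exactly one of these categories.
outcomes = {
    "inserted": 29_406_404,
    "updated": 36_897_886,
    "already in database": 20_988_875,
    "zero length, ignored": 5_614_548,
    "combined": 4_241_173,
}

total_accounted = sum(outcomes.values())
records_per_second = records_read / elapsed_seconds
elapsed_hours = elapsed_seconds / 3600

print(f"accounted for: {total_accounted} of {records_read}")
print(f"throughput: {records_per_second:.0f} records/s over {elapsed_hours:.1f} hours")
for name, count in outcomes.items():
    # Share of all records read that fell into this category.
    print(f"{name}: {100 * count / records_read:.1f}%")
```

The counters add up exactly to the number of records read, which is consistent with the assumption that every record is counted once. The average rate works out to roughly 1646 records per second over about 16.4 hours.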
Full version importing is much faster now.
Full: 16 hours -> 24 minutes
Minimal: 23 minutes -> 21 minutes
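To put those two lines in proportion, the speedup factors can be computed directly. A minimal sketch, using only the times quoted above ("16 hours" is a rounded figure, so the result for the full version is approximate); the names are illustrative.

```python
# Import times reported above, in minutes: (before, after).
timings = {
    "full": (16 * 60, 24),
    "minimal": (23, 21),
}

# Speedup factor = time before the change / time after the change.
speedup = {name: before / after for name, (before, after) in timings.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x faster")
```

This gives about 40x for the full version and about 1.1x for the minimal version.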
Wow, excellent! I want to test it. What was the magic?
The impact this made is incredible.
I tested with the Android full version and it finished in 24 minutes.
I tested again, this time from SSD to SSD. Compared with the tests using RAM, there wasn't much of a difference. The database is now complete, and the results were:
RDS_2024.03.1_android - 25"
RDS_2024.03.1_ios - 52"
RDS_2024.03.1_legacy - 2:19"
RDS_2024.03.1_modern - 5:29"
Thank you for your help
Thanks @paulobreim for reporting and testing this issue. One last comment... For the "modern" hash set, I believe that the "minimal" version would be the one with the most noticeable difference (in terms of importing time) compared to the "full" version.
I tested the difference in processing time when using the hash database. The test was performed on an image of a Samsung SM-G780G, acquired with Cellebrite, which generated the files below.
17/08/2023 12:26 4.068.589.593 EvidenceCollection_2023-08-17_Report.ufdr
17/08/2023 12:06 4.928.307.200 EvidenceCollection_2023-08-17_Report.z01
17/08/2023 12:07 4.928.307.200 EvidenceCollection_2023-08-17_Report.z02
17/08/2023 12:07 4.928.307.200 EvidenceCollection_2023-08-17_Report.z03
17/08/2023 12:08 4.928.307.200 EvidenceCollection_2023-08-17_Report.z04
17/08/2023 12:08 4.928.307.200 EvidenceCollection_2023-08-17_Report.z05
17/08/2023 12:09 4.928.307.200 EvidenceCollection_2023-08-17_Report.z06
17/08/2023 12:09 4.928.307.200 EvidenceCollection_2023-08-17_Report.z07
17/08/2023 12:10 4.928.307.200 EvidenceCollection_2023-08-17_Report.z08
17/08/2023 12:11 4.928.307.200 EvidenceCollection_2023-08-17_Report.z09
17/08/2023 12:11 4.928.307.200 EvidenceCollection_2023-08-17_Report.z10
17/08/2023 12:12 4.928.307.200 EvidenceCollection_2023-08-17_Report.z11
17/08/2023 12:12 4.928.307.200 EvidenceCollection_2023-08-17_Report.z12
17/08/2023 12:13 4.928.307.200 EvidenceCollection_2023-08-17_Report.z13
17/08/2023 12:14 4.928.307.200 EvidenceCollection_2023-08-17_Report.z14
17/08/2023 12:24 4.928.307.200 EvidenceCollection_2023-08-17_Report.z15
Processing time without iped-hashes.db: 40 minutes.
Processing time with iped-hashes.db: 21 minutes.
Both runs generated 435,610 items in IPED, but what caught my attention is that EvidenceCollection_2023-08-17_Report.z01 does not appear under the Evidences item in IPED. I don't know whether this is expected.
paulo
As discussed in https://github.com/sepinf-inc/IPED/discussions/2155.
I am finishing a test importing a "full" (not "minimal") version of the latest NSRL Android SQLite database. It is taking a long time (~10 hours so far, 81% processed). I will post the final numbers here. Although it should be faster than what @paulobreim observed in his environment (there is probably something else going on there), it is still far too slow: the "minimal" version was imported in 23 minutes on my PC, and it is only ~14% smaller than the "full" version, comparing SQLite file sizes.
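For reference, a linear extrapolation from that progress figure can be compared with the final elapsed time in the import log posted in this thread (59027 seconds). This is a rough sketch that assumes the log belongs to the same run and that "~10 hours" and "81%" are exact; the variable names are illustrative.

```python
hours_so_far = 10.0       # "~10 hours so far"
fraction_done = 0.81      # "81% processed"
actual_seconds = 59_027   # elapsed time from the import log in this thread

# Linear extrapolation: assumes a constant processing rate for the whole file.
estimated_total_hours = hours_so_far / fraction_done
actual_hours = actual_seconds / 3600

print(f"estimated total: {estimated_total_hours:.1f} h")
print(f"actual total:    {actual_hours:.1f} h")
```

Under those assumptions the extrapolation gives a little over 12 hours, while the run actually took about 16.4 hours, which suggests the import slowed down towards the end rather than progressing at a constant rate.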
The good news is that I found a simple way to speed it up, specific to the processing of NSRL SQLite files. The gain for the "minimal" version should be small, but it should make a considerable difference for the "full" version.
As I mentioned in the discussion, regardless of a possible optimization, I recommend using "minimal" versions (at least for IPED's usage).