sepinf-inc / IPED

IPED Digital Forensic Tool. It is open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in corporate investigations by private examiners.

Increased Volume Found after --continue #2257

Open aberenguel opened 3 days ago

aberenguel commented 3 days ago

I'm processing a dd file with the master branch (profile pedo, enableOCR and enableFaceRecognition). The process aborted partway through, so I re-executed with --continue. Then I noticed that the Volume Found (Volume Descoberto) increased.

[screenshot: TimePhoto_20240701_125326]

lfcnassif commented 3 days ago

Thanks @aberenguel for reporting. It shouldn't happen, possibly it is related to #telegram

A few weeks ago I noticed some items were receiving a different trackId when resuming (--continue), when it should be the same, so that already processed items could be identified and skipped. I haven't investigated the root cause yet, but that would explain the issue you reported.
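To illustrate this hypothesis with a toy sketch (this is NOT IPED's actual trackId code; the function below is invented for illustration): if a subitem's trackId is derived from its parent's id plus its position in the extraction order, then the same subitem re-extracted at a different position on resume gets a different trackId, so the resume logic no longer recognizes it as already processed.

```java
// Hypothetical sketch of a position-based id, not IPED's real implementation.
public class TrackIdSketch {
    // If the id mixes in the subitem's extraction position, a change in
    // extraction order changes the id of the very same subitem.
    static String trackId(String parentTrackId, int position) {
        return Integer.toHexString((parentTrackId + ":" + position).hashCode());
    }

    public static void main(String[] args) {
        // Same subitem, same position on both runs -> same id (resume works):
        System.out.println(trackId("evidence/item1", 0).equals(trackId("evidence/item1", 0))); // true
        // Same subitem at a different position on resume -> different id:
        System.out.println(trackId("evidence/item1", 0).equals(trackId("evidence/item1", 1))); // false
    }
}
```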

Are you able to share a small image and provide the aborting point to resume to reproduce the issue?

lfcnassif commented 3 days ago

Do you have Telegram databases in this case? @hauck-jvsh, does the Telegram parser always extract subitems in the same order? This is needed for --continue to work properly, since the subitem number/position is used in the trackID computation. If a HashSet or HashMap, for example, is used somewhere in the parser to store subitems to be extracted, that may be the cause; just a blind guess...
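A quick demonstration of why a HashSet could cause this (generic Java behavior, not a claim about the Telegram parser's actual code): HashSet iteration order is unspecified and unrelated to insertion order, while LinkedHashSet preserves insertion order.

```java
import java.util.*;

public class SubitemOrderDemo {
    public static void main(String[] args) {
        // Hypothetical subitem names in the order the parser found them:
        List<String> extractionOrder = Arrays.asList("msg_42", "msg_7", "msg_19", "msg_3");

        Set<String> hashed = new HashSet<>(extractionOrder);       // iteration order unspecified
        Set<String> linked = new LinkedHashSet<>(extractionOrder); // insertion order preserved

        System.out.println(new ArrayList<>(linked)); // [msg_42, msg_7, msg_19, msg_3]
        System.out.println(new ArrayList<>(hashed)); // order depends on hash codes/capacity
    }
}
```

If subitems are collected in a HashSet/HashMap before extraction, the position each subitem receives can differ between runs (e.g. across JVM versions or different insertion histories), which would change any position-based trackId.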

lfcnassif commented 3 days ago

Or maybe other parsers are affected by the hypothesis above...

hauck-jvsh commented 3 days ago

I never thought about this, and I never ran any test to check whether the items are being extracted in the same order.

lfcnassif commented 3 days ago

Don't worry, I never gave this recommendation to contributors, and I only realized today that it might be the issue.

aberenguel commented 2 days ago

I think it is not related to Telegram. This is the case result: [screenshot]

aberenguel commented 2 days ago

> Are you able to share a small image and provide the aborting point to resume to reproduce the issue?

The triggering image is 1 TB. I'll try to reproduce with a smaller image. Another test I'm going to run is to close the case when sleuth.db is complete and then resume with --continue.

aberenguel commented 2 days ago

I was able to reproduce it by killing the Java processes with kill -9 and resuming the processing with --continue.

lfcnassif commented 2 days ago

Are you able to identify which files/artifacts were duplicated in the case? Look at the trackId property in the Metadata filter panel; each value should occur for just one file. If no trackId is duplicated, looking for files with the same hashes and same paths may help to find the duplicated artifacts. And actually there is a small chance that nothing was duplicated at all and it is just a minor bug in the Total Volume counter.
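One way to check for repeated trackId values programmatically (a sketch; it assumes you can export the case's trackId values to a plain list, e.g. via a report or script, since this is not a built-in IPED command):

```java
import java.util.*;
import java.util.stream.*;

public class DuplicateTrackIdCheck {
    // Returns the trackId values that occur more than once in the case.
    static List<String> findDuplicates(List<String> trackIds) {
        return trackIds.stream()
                .collect(Collectors.groupingBy(id -> id, Collectors.counting()))
                .entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical trackId values; "a1f3" occurs twice -> duplicated item.
        List<String> ids = Arrays.asList("a1f3", "b2c4", "a1f3", "d5e6");
        System.out.println(findDuplicates(ids)); // [a1f3]
    }
}
```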

lfcnassif commented 1 day ago

I took a look at the --continue related code, and it seems to take into account the size of items that are containers AND of subitems/carved items in the total volume to process, while standard processing doesn't. So maybe this is just a minor issue with the volume count...
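A toy model of that hypothesis (my reading of the comment above, not the actual IPED counters): if the resume path adds both a container's size and its subitems' sizes to the total, while the standard path counts the container only once, the resumed total comes out inflated even though nothing was processed twice.

```java
public class VolumeCountSketch {
    // Standard processing (per the hypothesis): each top-level item counted once.
    static long standardTotal(long containerSize, long[] subitemSizes) {
        return containerSize;
    }

    // --continue path (per the hypothesis): container AND subitem sizes summed.
    static long resumeTotal(long containerSize, long[] subitemSizes) {
        long total = containerSize;
        for (long s : subitemSizes) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        long container = 1_000_000_000L;                 // 1 GB container
        long[] subitems = {400_000_000L, 350_000_000L};  // data extracted/carved from it
        System.out.println(standardTotal(container, subitems)); // 1000000000
        System.out.println(resumeTotal(container, subitems));   // 1750000000
    }
}
```

This double counting would explain a larger "Volume Found" after --continue without any artifact actually being duplicated in the case.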