sepinf-inc / IPED

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Multi valued item location map #2088

Open patrickdalla opened 4 months ago

patrickdalla commented 4 months ago

Implements support for items with multi-valued location metadata.

Highlighting and selection are done per item, i.e., when one item is highlighted, all of its corresponding marks on the map are highlighted, and when one mark on the map is highlighted, all other marks of the same item are highlighted too.

lfcnassif commented 1 month ago

Hi @patrickdalla, I'm starting to test this, thanks for this fix!

Highlighting and selection are done per item, i.e., when one item is highlighted, all of its corresponding marks on the map are highlighted, and when one mark on the map is highlighted, all other marks of the same item are highlighted too.

What is the expected behavior for checkboxes? When one item is checked in the table, should all its locations be checked on the map? And if only one, or just a subset, of the locations related to the same table item is checked on the map, what should the item state be: checked or not?

patrickdalla commented 1 month ago

"What is the expected behavior for checkboxes? When one item is checked in the table, should all its locations be checked on the map?" Yes.

"And if only one, or just a subset, of the locations related to the same table item is checked on the map, what should the item state be: checked or not?" Checked. For every item that had at least one subitem included in the check operation, all of its related subitems will also be checked.

As the check/uncheck is per item, all subitems on map will reflect the item state.
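
The per-item rule above can be sketched as a small model in which the checked state is stored once per item and every map marker simply reads it. This is only an illustration: the class name, the method names and the "itemId_index" marker ids are hypothetical and are not taken from the IPED code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not IPED code: one item may own several map markers.
class ItemCheckModel {
    // marker id -> id of the item that owns it
    private final Map<String, Integer> markerToItem = new HashMap<>();
    // item id -> every marker plotted for that item
    private final Map<Integer, List<String>> itemToMarkers = new HashMap<>();
    // the checked state lives here, once per item, never per marker
    private final Set<Integer> checkedItems = new HashSet<>();

    void addMarker(int itemId, String markerId) {
        markerToItem.put(markerId, itemId);
        itemToMarkers.computeIfAbsent(itemId, k -> new ArrayList<>()).add(markerId);
    }

    // Checking or unchecking in the table changes the single per-item state.
    void setItemChecked(int itemId, boolean checked) {
        if (checked) {
            checkedItems.add(itemId);
        } else {
            checkedItems.remove(itemId);
        }
    }

    // Checking one marker on the map is forwarded to the owning item,
    // so all sibling markers follow automatically.
    void setMarkerChecked(String markerId, boolean checked) {
        Integer itemId = markerToItem.get(markerId);
        if (itemId != null) {
            setItemChecked(itemId, checked);
        }
    }

    boolean isItemChecked(int itemId) {
        return checkedItems.contains(itemId);
    }

    // A marker has no state of its own: it reflects its item's state.
    boolean isMarkerChecked(String markerId) {
        Integer itemId = markerToItem.get(markerId);
        return itemId != null && checkedItems.contains(itemId);
    }

    List<String> markersOf(int itemId) {
        return itemToMarkers.getOrDefault(itemId, Collections.emptyList());
    }
}
```

With this shape, a checked table item with unchecked markers cannot occur in the model itself, so any such mismatch would have to come from the view not being refreshed.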

lfcnassif commented 1 month ago

Testing with one WhatsApp database, expanding single messages, 119 locations were plotted on the map. When disabling single message expansion, just 115 locations were plotted. I think the number should be the same, right?

lfcnassif commented 1 month ago

I didn't find the cause of this issue, but there is multithreaded code in the GetResultsJSWorker class reading and writing non-thread-safe variables, like itemsWithGPS and gpsItems. Please take a look, @patrickdalla. Also, the Semaphore usage could probably be avoided and replaced by a simpler synchronization approach.
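
As a rough idea of what a simpler synchronization approach could look like, the sketch below guards both shared variables with one lock so they are always updated together. It is not the GetResultsJSWorker code: the class is hypothetical, and although the field names echo the ones mentioned above, their types here are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical collector, not IPED code: field types are assumed for illustration.
class GpsResultCollector {
    private final Object lock = new Object();
    private final List<String> gpsItems = new ArrayList<>(); // assumed type
    private int itemsWithGPS = 0;                             // assumed type

    // Called by worker threads; both fields change under the same lock.
    void addItem(List<String> locations) {
        if (locations.isEmpty()) {
            return;
        }
        synchronized (lock) {
            itemsWithGPS++;
            gpsItems.addAll(locations);
        }
    }

    int getItemsWithGPS() {
        synchronized (lock) {
            return itemsWithGPS;
        }
    }

    int getLocationCount() {
        synchronized (lock) {
            return gpsItems.size();
        }
    }

    // Feeds 'items' fake items with two locations each through a thread pool.
    static GpsResultCollector collect(int items, int threads) {
        GpsResultCollector collector = new GpsResultCollector();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < items; i++) {
            final int id = i;
            pool.submit(() -> collector.addItem(Arrays.asList(id + "_0", id + "_1")));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return collector;
    }
}
```

If the workers really run almost in sequence, dropping the thread pool and filling the two variables from a single thread would be simpler still.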

patrickdalla commented 1 month ago

You are right! In fact, no real advantage is gained from this multithreading; the threads run almost in sequence. I built some structure to test multithreading, but left the final parameters and code running as if no multithreading was used, as I could not measure any performance gain. Since this code only increases complexity, I will remove it and simplify.

patrickdalla commented 1 month ago

Testing with one WhatsApp database, expanding single messages, 119 locations were plotted on the map. When disabling single message expansion, just 115 locations were plotted. I think the number should be the same, right?

I found a similar case. For one of the chats, the WhatsApp chat parser extracted only 19 geolocations into the metadata when extractMessages is false. But when extractMessages is true, it extracts 21 distinct georeferenced instant message items. For this chat I noticed that there are 2 pairs of IMs with the same georeferences. Maybe the WhatsApp chat parser removes duplicates.

Please check whether the same happens in your case. If so, it would not be an error in this PR.

Maybe it is not even an issue in the WhatsApp parser, but a normal result. Could it be Lucene that removes these redundant values within the same metadata?

wladimirleite commented 1 month ago

Maybe the WhatsApp chat parser removes duplicates.

As far as I know, the parser does not remove duplicated locations.

Maybe it is not even an issue in the WhatsApp parser, but a normal result. Could it be Lucene that removes these redundant values within the same metadata?

Duplicated values in the same item are explicitly removed by MetadataUtil.removeDuplicateValues(Metadata metadata). If there are duplicated locations in the same chat, the number of locations is expected to be different, right?! If extractMessages is enabled, locations are set in individual messages (in this case, duplicated locations will be assigned to different items). And if extractMessages is disabled, all locations belonging to a chat will be assigned to a single item, and later the duplicated ones will be removed.
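
The effect on the counts can be reproduced with a small standalone sketch. It does not call or reproduce MetadataUtil.removeDuplicateValues; the dedupe helper below is a stand-in that assumes duplicates are dropped within one item's list of values, and the coordinates are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Illustrative only: shows why the plotted totals differ between the two modes.
class LocationCountDemo {

    // Stand-in for per-item duplicate removal: drops repeated values
    // inside ONE item's metadata, keeping first-seen order.
    static List<String> dedupe(List<String> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    // extractMessages disabled: every message location lands on the single
    // chat item, and duplicates are removed afterwards.
    static int countAsSingleChatItem(List<String> messageLocations) {
        return dedupe(messageLocations).size();
    }

    // extractMessages enabled: each message is its own item holding one
    // location, so per-item duplicate removal has nothing to remove.
    static int countAsExpandedMessages(List<String> messageLocations) {
        int total = 0;
        for (String location : messageLocations) {
            total += dedupe(Arrays.asList(location)).size();
        }
        return total;
    }

    public static void main(String[] args) {
        // Five messages, two pairs of which share the same (made-up) coordinates.
        List<String> locations = Arrays.asList(
                "-15.79;-47.88", "-23.55;-46.63", "-15.79;-47.88",
                "-22.90;-43.17", "-23.55;-46.63");
        System.out.println(countAsExpandedMessages(locations)); // prints 5
        System.out.println(countAsSingleChatItem(locations));   // prints 3
    }
}
```

Each pair of repeated locations in a chat lowers the single-item count by one, which matches the 21 versus 19 case with 2 duplicated pairs reported above.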

lfcnassif commented 1 month ago

Duplicated values in the same item are explicitly removed by MetadataUtil.removeDuplicateValues(Metadata metadata). If there are duplicated locations in the same chat, the number of locations is expected to be different, right?! If extractMessages is enabled, locations are set in individual messages (in this case, duplicated locations will be assigned to different items). And if extractMessages is disabled, all locations belonging to a chat will be assigned to a single item, and later the duplicated ones will be removed.

Thanks @patrickdalla and @wladimirleite, that is the exact cause! I forgot about the MetadataUtil.removeDuplicateValues(Metadata metadata) method (implemented by me...). After disabling it for testing, the same number of locations was extracted with extractMessages on or off.

lfcnassif commented 1 month ago

Maybe the WhatsApp chat parser removes duplicates.

I checked this carefully yesterday and it doesn't happen; the cause really is what @wladimirleite pointed out. I think the current behavior is fine. Sorry, @patrickdalla, for taking your time.

lfcnassif commented 1 month ago

"What is the expected behavior for checkboxes? When one item is checked in the table, should all its locations be checked on the map?" Yes.

"And if only one, or just a subset, of the locations related to the same table item is checked on the map, what should the item state be: checked or not?" Checked. For every item that had at least one subitem included in the check operation, all of its related subitems will also be checked.

As the check/uncheck is per item, all subitems on map will reflect the item state.

Unfortunately, the behavior described above is not working properly; see the video below:

https://github.com/sepinf-inc/IPED/assets/7276994/aa8f97da-edc3-421f-9945-168b40fe64ed

lfcnassif commented 1 month ago

PS1: You can see the checked item in the table, but both of its locations are unchecked on the map on the right side.

PS2: When checking one location on the map, the other one, at the same place and related to the same item, is not checked.

PS3: The balloon checkbox inconsistency also happens with expanded locations when extractMessages is true.