soarlab / maline

Android Malware Detection Framework
GNU Affero General Public License v3.0

Find intersection of data sets #12

Closed by mdimjasevic 10 years ago

mdimjasevic commented 10 years ago

We need a script that takes a list of directories containing app .log files and finds the apps that are common to all of the directories. Every .log file has a file name of this form:

<num>-<apk-filename>-<app-name>-<timestamp>.log

A match between two .log files exists if they have the same <apk-filename>-<app-name> part. For example, there is a match between these two .log files:

100-004928e699609da4193131777e10cd2ce30c449031e37175ddfbed7ec8009598-com.jx.theme.n1117089725-2014-10-11-22-56-39.log
98-004928e699609da4193131777e10cd2ce30c449031e37175ddfbed7ec8009598-com.jx.theme.n1117089725-2014-10-14-23-20-58.log
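A minimal sketch of the matching rule, in Python. It assumes that `<num>` is always a run of digits and that `<timestamp>` always has the `YYYY-MM-DD-HH-MM-SS` shape seen in the two example names; neither assumption is stated explicitly above. The function name `app_key` is hypothetical.

```python
import re

# Assumed layout: <num>-<apk-filename>-<app-name>-<YYYY-MM-DD-HH-MM-SS>.log
# The greedy (.+) keeps everything between the leading number and the
# trailing timestamp, so dashes inside the key itself are preserved.
LOG_NAME = re.compile(r"^\d+-(.+)-\d{4}(?:-\d{2}){5}\.log$")


def app_key(filename):
    """Return the <apk-filename>-<app-name> part, or None if the name does not fit."""
    match = LOG_NAME.match(filename)
    return match.group(1) if match else None


a = app_key("100-004928e699609da4193131777e10cd2ce30c449031e37175ddfbed7ec8009598-com.jx.theme.n1117089725-2014-10-11-22-56-39.log")
b = app_key("98-004928e699609da4193131777e10cd2ce30c449031e37175ddfbed7ec8009598-com.jx.theme.n1117089725-2014-10-14-23-20-58.log")
print(a == b)
```

Both example names reduce to the same key, so they count as a match even though the leading number and the timestamp differ.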

The script needs to find apps that have a match across all the directories.

For example, if we had a file list-of-experiment-log-dirs with the following content:

/mnt/experiments/sp-500-wo-spoofing/android-logs
/mnt/experiments/sp-500-with-spoofing/android-logs
/mnt/experiments/sp-1000-wo-spoofing/android-logs
/mnt/experiments/sp-2000-wo-spoofing/android-logs
/mnt/experiments/sp-5000-wo-spoofing/android-logs

then we'd invoke the script like this:

$ ./script-name list-of-experiment-log-dirs

It should write its output to standard output, one app per line, in this form:

<apk-filename1>-<app-name1>
<apk-filename2>-<app-name2>
...
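
One possible shape for such a script, as a sketch rather than a final implementation. It assumes Python 3, assumes the timestamp always looks like `YYYY-MM-DD-HH-MM-SS`, skips blank lines in the list file, ignores file names that do not fit the pattern, and sorts the output (the issue does not ask for any particular order). The names `app_keys`, `read_dir_list` and `common_apps` are hypothetical.

```python
#!/usr/bin/env python3
import os
import re
import sys

# Assumed layout: <num>-<apk-filename>-<app-name>-<YYYY-MM-DD-HH-MM-SS>.log
LOG_NAME = re.compile(r"^\d+-(.+)-\d{4}(?:-\d{2}){5}\.log$")


def app_keys(directory):
    """Collect the <apk-filename>-<app-name> keys of all .log files in a directory."""
    keys = set()
    for name in os.listdir(directory):
        match = LOG_NAME.match(name)
        if match:
            keys.add(match.group(1))
    return keys


def read_dir_list(list_path):
    """Read one directory path per line, skipping blank lines."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def common_apps(directories):
    """Return the sorted keys that occur in every directory."""
    if not directories:
        return []
    common = app_keys(directories[0])
    for directory in directories[1:]:
        common &= app_keys(directory)  # set intersection
    return sorted(common)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print(f"usage: {sys.argv[0]} <file-with-log-dirs>", file=sys.stderr)
    else:
        for key in common_apps(read_dir_list(sys.argv[1])):
            print(key)
```

Intersecting one set per directory means each directory is listed only once, and an app missing from any single directory drops out of the result.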