I will use this thread to propose changes to the project. The changes are listed (or at least should be, once we agree on them) in the order in which they should be implemented. Please comment in this issue whether you agree or disagree. I suggest implementing these changes in smaller packages for easier review.

This issue should be updated as steps are implemented, to add more detail and answer questions that come up.

Checkboxes:

[x] 1. fix Xiaomi issue #1
[x] 2. #3
[x] 3. #4 remove scripts which will not be needed after restructuring the project. The only task of this project is to 1) analyze one or multiple APK files and 2) calculate the security scoring formula. The extraction tool for firmware images is open source at https://github.com/ernstleierzopf/AndScanner and it does not need a dockerfile to extract a single firmware image. The mentioned files include:
- run.sh (include instructions on how to run the software in readme.md, which will be changed anyway)
- dockerfile_*
- data directory
- move submodules/TestCasesScripts/* to the root directory of the project and remove these directories
[ ] 4. #5 create small script to hash all APKs from one firmware image hash_apks.py. Example usage: python3 hash_apks.py "data/apks/husky-uq1a.240105.004-factory-498499a8" "data/apk_hashes.csv"
[ ] 5. #6 create script to analyze a single APK file analyze_apk.py. Example usage: python3 analyze_apk.py "data/apks/husky-uq1a.240105.004-factory-498499a8/YouTube" "data/results/all_apk_hashes.csv" -> test if hash of .apk file is found in all_apk_hashes.csv. Yes -> exit. No -> run all tests and append to files with all results (see #9). Hint: this uses the result directory (in this case data/results) and writes/creates the result files.
[ ] 6. #7 create separate script calculate_formula.py to calculate the formula results using the configured permission weights in config/methods_config.yml and the list of app hashes created in hash_apks.py. Example usage: python3 calculate_formula.py "config/methods_config.yml" "data/apks/husky-uq1a.240105.004-factory-498499a8" "data/apk_hashes.csv" "data/results". Hint: this script uses the same files created/written by the analyze_apk.py. Creates formula.csv.
[ ] 7. create script extract_results.py to extract all results from the merged results (in this case in data/results). Example usage: python3 extract_results.py "data/apk_hashes.csv" "data/results" "data/results/husky-1725366134_uq1a.240105.004-factory-498499a8_MD5-HASH-OF-FW"
[ ] 7. #8 change main.py to do all of the previous steps, namely hash all extracted APKs from the firmware image, iterate through all of them using analyze_apk.py, and calculate the formula (append the result to the formula.csv file). main.py must have all of the necessary arguments for all of the python scripts. The main.py file should use the resulting files to create one directory per firmware image with the interesting data to allow for efficient analysis.
[ ] 8. #9 ~~store interesting additional information of APKs such as the list of permissions and dependencies (including version?).~~ rework the structure of the result files.
~~- implement proper logging for app analysis (use logging module and have one total file for all analysis. Distinction can be made with the APK hash)~~
- include a UNIX timestamp in each line of the analysis (these are better than dates, because if a human wants to read them, Excel or LibreOffice can easily convert these timestamps to dates).
- have four result files: test-results.csv (currently Report), total-fail-counts.csv, findings.csv, and formula.csv. formula.csv includes the name of the vendor, the firmware image name, the hash of the firmware image, some kind of automated encoding of the configured permissions/weights (base64 encoding? - easier for comparison, worst case just include the yml data from the config), and the score. These files should be the default values for the analyze_apk.py script.
~~- [ ] 9. previously missing~~
~~- [ ] 10. #10 remove all database usage in the project. Extract data only to human readable (text, csv or other fitting formats) files.~~
[ ] 11. Network Tests seem to not be available (NA) in any of the runs. Does this only work on real devices or why does the test not work?

11 Very important would be to exactly define which tests were implemented (including the ID of the MASTG test). This should be visible in the file name. The file should include a comprehensive description of what part of the document was implemented and what the goal of the test is.
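As a starting point for step 4, hash_apks.py could look roughly like the sketch below. The hash algorithm (SHA-256), the `;` delimiter, and the column names `hash` and `path` are assumptions for illustration; none of them has been agreed on in this issue.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large APKs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_apks(apk_dir, csv_path):
    """Write one 'hash;path' row per .apk file found below apk_dir."""
    rows = []
    for apk in sorted(Path(apk_dir).rglob("*.apk")):
        # Store the path relative to the image directory (assumed layout).
        rows.append((sha256_of_file(apk), apk.relative_to(apk_dir).as_posix()))
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow(["hash", "path"])  # assumed column names
        writer.writerows(rows)
    return rows
```

Wired to the command line, `hash_apks("data/apks/husky-uq1a.240105.004-factory-498499a8", "data/apk_hashes.csv")` would correspond to the example usage given in step 4.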

Hi Ernst,

Thank you for your comments, we will add improvements as we go.

Regarding question 11, for an APK, if the NETWORK test cases appear as NA, it is because the application does not have any of the INTERNET, ACCESS_NETWORK_STATE or ACCESS_WIFI_STATE permissions.
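A minimal sketch of that rule, for illustration only: the function names are hypothetical and this is not the code of the existing scripts. It assumes the declared permissions are available as full Android permission strings.

```python
# Permissions that make the NETWORK test cases applicable (from the comment above).
NETWORK_PERMISSIONS = {
    "android.permission.INTERNET",
    "android.permission.ACCESS_NETWORK_STATE",
    "android.permission.ACCESS_WIFI_STATE",
}


def network_tests_applicable(declared_permissions):
    """True if the APK declares at least one of the network permissions."""
    return not NETWORK_PERMISSIONS.isdisjoint(declared_permissions)


def network_result(declared_permissions, run_tests):
    """Return 'NA' without running anything when no network permission is declared."""
    if not network_tests_applicable(declared_permissions):
        return "NA"
    return run_tests()
```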

We also wanted to ask you a question: in point 8, what exactly do you mean by dependencies?

Hi Francisco,

Thanks for answering my question. I would argue that an application not having the mentioned permissions should lead to a positive test result, because the vulnerability cannot exist in it. What do you think?

With dependencies I mean the and tags within the Manifest files. These show which libraries are used by the APK.

We would also like to help with the implementation, because it is a lot of work.

Hey, my quick comments on the issues.

High-level comment: I would suggest we keep the DekraScript oblivious of how we structure the rest of the project (i.e. it does not need to know what vendor or firmware an APK is from). This accomplishes two important things: (i) the DekraScript stays flexible for other uses (UNIX tradition) and (ii) we don't need to synchronize project structure across two repositories.

Since we agree on most issues in general, do we want to transform this issue into a task list and discuss details in the individual issues?

1. Obviously sounds good :)
2. What timeout should we aim for? I feel choosing something like 2xp95 would be a good start, but it requires some quick measurements. Alternatively, we time out the whole process at a higher level (i.e. our scripts).
3. Maybe we can specify a sample CLI call and expected output structure? That way there is some sort of specification to work against. Personally, I am a big fan of all things examples and sample code. I added a suggestion below.
4. Not sure if "hash" is the right word here. Maybe "uniquely name", since you seem to use a UUID in your example? Also, this would require a bit more detail on what the input structure looks like... maybe this is better left for the higher-level logic that calls the DekraScripts? Then we leave the tool here flexible.
5. See point 3 re/ sample input/output.
6. I probably would leave the formula calculation entirely to the evaluation phase. That separates concerns and allows for easier experimentation with the parameters.
7. That seems then to bake a lot of assumptions into the tools of this repository. Maybe that logic is better kept in the higher-level project?
8. Generally agree. I think this can all be captured in an input/output as an easy specification.
9. (missing)
10. Strongly agree :) I think the best would be to simply wrap the current calls in a simple data layer and then exchange the implementation underneath.
11. (no comment)

Regarding the last post: I am not sure what you mean with "tags within the Manifest files". Figuring that out for APK files is a quite hard problem indeed (see e.g. LibID: https://dl.acm.org/doi/pdf/10.1145/3293882.3330563)

Sample (just as a starting point, please copy-paste and share your variant)

Input file structure (keep in mind that the caller can name the sample.apk as it likes, the script does not need to know):

DekraScripts/
sample.apk

Call the single-APK script (keep in mind that the caller can name the output-folder as it likes, the script does not need to know):

$ python3 analyze_apk.py ../sample.apk -o ../output-folder

Output file structure:

DekraScripts/
sample.apk
output-folder/permissions.csv
output-folder/test-results.csv
output-folder/logging.txt

Output in output-folder/permissions.csv:

permission;comment;
INTERNET;;
READ_STORAGE;;

Output in output-folder/test-results.csv (note that I mention both the MASTG test ID and the check name, this is because we might do multiple independent checks against the same test ID; the evaluation can make sense of it later):

test;check;result;comment;
MASTG-TEST-0013;check-for-hard-coded-keys;PASS;checked 1000 files;
MASTG-TEST-0013;check-for-old-ciphers;PASS;checked 1337 files;
MASTG-TEST-0016;check-for-non-secure-random;FAIL;found 2 failures;
MASTG-TEST-0015;check-for-key-reuse;NA;ignore because of test error;

Output in output-folder/logging.txt:

...
20230101T1100 [Test0016.py] INFO Found non secure random in file with SHA call: CustomCrypto.java:111
20230101T1100 [Test0016.py] INFO Found non secure random in file with AES call: MyEncryption.java:42
20230101T1100 [Test0015.py] WARN The data flow analysis threw an error: "cyclic dependency detected, bail"
...

As said, this is just an example. And please copy-paste the next iteration below :)
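To show how the evaluation side could consume such a test-results.csv, here is a small sketch. The function name `summarize` is hypothetical, and the column names are taken from the sample header above. Python's `csv.DictReader` tolerates the optional trailing `;`, so rows with and without it are read the same way.

```python
import csv
import io
from collections import defaultdict

# Sample rows in the proposed format (values copied from the example above).
SAMPLE = """test;check;result;comment;
MASTG-TEST-0013;check-for-hard-coded-keys;PASS;checked 1000 files;
MASTG-TEST-0013;check-for-old-ciphers;PASS;checked 1337 files;
MASTG-TEST-0016;check-for-non-secure-random;FAIL;found 2 failures;
"""


def summarize(handle):
    """Group (check, result) pairs by MASTG test ID."""
    summary = defaultdict(list)
    for row in csv.DictReader(handle, delimiter=";"):
        summary[row["test"]].append((row["check"], row["result"]))
    return dict(summary)


summary = summarize(io.StringIO(SAMPLE))
for test_id, checks in summary.items():
    print(test_id, checks)
```

Grouping by test ID keeps multiple independent checks against the same MASTG test visible, which matches the intent of listing both columns.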

Hey, my quick comments on the issues.

High-level comment: I would suggest we keep the DekraScript oblivious of how we structure the rest of the project (i.e. it does not need to know what vendor or firmware an APK is from). This accomplishes two important things: (i) the DekraScript stays flexible for other uses (UNIX tradition) and (ii) we don't need to synchronize project structure across two repositories.

sounds good.

Since we agree on most issues in general, do we want to transform this issue into a task list and discuss details in the individual issues?

Yes, I will create atomic tasks/issues that can each be implemented on their own. However, I would leave the numbering as it is, because it is easier to mention in comments. I added a checkbox list above, which will be extended with the issues.

Obviously sounds good :)

What timeout should we aim for? I feel choosing something like 2xp95 would be a good start, but it requires some quick measurements. Alternatively, we time out the whole process at a higher level (i.e. our scripts).

300 seconds per test is a good value.
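A per-test timeout could be enforced roughly as follows. This is only a sketch: it assumes each test can be started as a separate command, and the mapping of exit code 0 to PASS (anything else to FAIL) is made up for illustration. A timed-out test is reported as NA here.

```python
import subprocess
import sys

TEST_TIMEOUT_SECONDS = 300  # per-test value suggested in this thread


def run_with_timeout(command, timeout=TEST_TIMEOUT_SECONDS):
    """Run one test command and return (result, comment)."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "NA", f"timed out after {timeout} seconds"
    # Assumed convention: exit code 0 means the test passed.
    result = "PASS" if completed.returncode == 0 else "FAIL"
    return result, completed.stdout.strip()


# A deliberately slow command with a short timeout, to show the NA path.
print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5))
```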

Maybe we can specify a sample CLI call and expected output structure? That way there is some sort of specification to work against. Personally, I am a big fan of all things examples and sample code. I added a suggestion below.

Good idea, but I do not have the answers on what the "best" structure is - this needs to be discussed when we are at that point.

Not sure if "hash" is the right word here. Maybe "uniquely name", since you seem to use a UUID in your example? Also, this would require a bit more detail on what the input structure looks like... maybe this is better left for the higher-level logic that calls the DekraScripts? Then we leave the tool here flexible.

Here I meant that the content of that file should be the hashes of each APK of the tested firmware image.

See point 3 re/ sample input/output.

agree.

I probably would leave the formula calculation entirely to the evaluation phase. That separates concerns and allows for easier experimentation with the parameters.

Yes, that is how I meant it to be. It should be completely separate and does not need to be called within the main processing task. I would leave this functionality in the Dekra project (but as a separate script)

That seems then to bake a lot of assumptions into the tools of this repository. Maybe that logic is better kept in the higher-level project?

good point. But our higher-level project is still not open-source, so we could also keep it here (at least for now). It does not hurt the flexibility of the tool as you mentioned in your comment above.

Generally agree. I think this can all be captured in an input/output as an easy specification.

(missing)

Strongly agree :) I think the best would be to simply wrap the current calls in a simple data layer and then exchange the implementation underneath.

(no comment)

Regarding the last post: I am not sure what you mean with "tags within the Manifest files". Figuring that out for APK files is a quite hard problem indeed (see e.g. LibID: https://dl.acm.org/doi/pdf/10.1145/3293882.3330563)

Oh, I did not know that. Thought these libraries are simply stated in the Manifest file. Skip this step, if not easily possible.

Regarding the "tags" from the ManifestFile, I think you are both right, there are things we can obtain as we did in the ModZoo project directly from the ManifestFile. But in order to know specific versions of libraries and get a full list of libraries we would need a tool like LibID which has exponential costs and is not scalable beyond a few libraries.

Again, this is also something that could be kept separate from the Dekra tools if it simply outputs the Manifest file, or could be incorporated.

Wow this is exactly what I was searching for all the time! We definitely need to include this in our pipeline. I am not sure, but we could let it run per app/library and compare new files against that profile - obviously we can skip files with the same hash. Needs some work to be done in advance, but some of it we also need to do for the Dekra scripts (like storing the hash of the APK).

Did you test this tool in your previous work?

Regarding our meeting today, I want to answer the question of how the results of the analysis should be stored. To allow us to run the analysis only once per app, we need to separate all of the steps.

Therefore, we first hash all of the apps and store the hashes in one file. (The hash at the end is currently taken from the results of the script, which seems to be the UUID? We should use the hash of the whole image file (zip) instead.)

The analyze_apk.py script runs all tests and writes the results into the four different files. These files will hold results for all APK files, so we are able to copy the interesting lines to a separate result directory in main.py. This script should also check whether the APK was already analyzed (by searching for the hash in the test-results.csv file).
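The "already analyzed" check and the appending of results could be sketched like this. The function names are hypothetical, and the column layout (APK hash first, then UNIX timestamp, test ID, result) is an assumption; the final structure of test-results.csv is still to be defined.

```python
import csv
import os
import time


def already_analyzed(apk_hash, results_csv):
    """True if any row of results_csv starts with apk_hash (assumed first column)."""
    if not os.path.exists(results_csv):
        return False
    with open(results_csv, newline="", encoding="utf-8") as handle:
        return any(
            row and row[0] == apk_hash
            for row in csv.reader(handle, delimiter=";")
        )


def append_result(results_csv, apk_hash, test_id, result):
    """Append one result row, including a UNIX timestamp."""
    with open(results_csv, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow([apk_hash, int(time.time()), test_id, result])
```

Appending rather than rewriting keeps one shared file for all APKs, so a later run only needs the hash lookup to decide whether to skip an app.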

The calculate_formula.py script reads all of the previously calculated hashes and gets the necessary results from test-results.csv (and if necessary from other files). Then it calculates the formula and outputs the configuration from config/methods_config.yml in some form.

main.py combines all of the above scripts and merges the results into image-specific results that contain only the APK files from that specific image. This means that it copies the affected result lines from all of the result files into a separate directory for that image.

If we want to change the permission configurations, we can simply run calculate_formula.py with the hashes file as parameter and store the new results in the same directory as the hashes file (with new timestamp for different results).
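The per-image merge step of main.py described in this comment could be sketched as below. It assumes that every shared result file is a `;`-separated CSV without a header row and with the APK hash in the first column; the function name and parameters are hypothetical.

```python
import csv
from pathlib import Path


def copy_image_results(image_hashes, results_dir, image_dir, hash_column=0):
    """Copy rows whose hash is in image_hashes into per-image result files."""
    image_hashes = set(image_hashes)
    image_dir = Path(image_dir)
    image_dir.mkdir(parents=True, exist_ok=True)
    copied = {}
    # Only top-level CSV files are read, so image_dir may live inside results_dir.
    for source in sorted(Path(results_dir).glob("*.csv")):
        with open(source, newline="", encoding="utf-8") as handle:
            rows = [
                row
                for row in csv.reader(handle, delimiter=";")
                if len(row) > hash_column and row[hash_column] in image_hashes
            ]
        with open(image_dir / source.name, "w", newline="", encoding="utf-8") as handle:
            csv.writer(handle, delimiter=";").writerows(rows)
        copied[source.name] = len(rows)
    return copied
```

Because the shared files are only read, the same call can be repeated for every firmware image without re-running any analysis.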

Sorry for the late reply, I somehow missed this. Yes, we tested this tool together with the original authors. Its runtime increases exponentially with the number of libraries (and versions) you want to test against. That is, it is great for checking for the presence of a handful or so of library versions that might be compromised etc., but very slow if you want to use it to find out which libraries an app uses out of hundreds (like we did at the time).

So we can use it for our project if we are careful about the scope.

No problem, happy to include this tool if we find some interesting filters for the versions/apps we want to analyze.

@frandelpinodekra @jmariasantosdekra @noeguedek please check out the new version of the changes. I removed unimportant changes and made the expected behavior of the scripts clearer. Now there is an exact specification of which parameters and results the scripts should have.

If you want, I can help you with creating these scripts. Just push the current changes to git and let me know that I can work on it (by working in my own fork and creating a pull request).

DEKRA-Cybersecurity / MAS-Preloaded-Apps-Scripts

Proposal for changes in the project architecture #2
