DEKRA-Cybersecurity / MAS-Preloaded-Apps-Scripts

MIT License
6 stars 2 forks source link

Proposal for changes in the project architecture #2

Closed ernstleierzopf closed 1 month ago

ernstleierzopf commented 7 months ago

I will use this thread to propose changes to the project. These are (or at least it should be after we agree upon it) ordered in the chronological order in which these changes should be implemented. Please comment in this issue if you agree/disagree. I suggest implementing these changes in smaller packages for easier review.

This issue should be adapted after steps of it were implemented to include more detail and answer occurring questions.

Checkboxes:

11 Very important would be to exactly define which tests were implemented (including the ID of the MASTG test. This should be seen in the file name. The file should include a comprehensive description of what part of the document was implemented and what the goal of the test is.

frandelpinodekra commented 6 months ago

Hi Ernst,

Thank you for your comments, we will add improvements as we go.

Regarding question 11, for an APK, if the NETWORK test cases appear as NA, it is because the application does not have any of the INTERNET, ACCESS_NETWORK_STATE or ACCESS_WIFI_STATE permissions.

We also wanted to ask you a question, in point 8 what exactly do you mean by dependencies?

ernstleierzopf commented 6 months ago

Hi Francisco,

thanks for answering my question. I would argue that the application not having the mentioned permissions should lead to a positive test result, because the vulnerability can not exist in it. What do you think?

With dependencies I mean the and tags within the Manifest files. These show which libraries are used by the APK.

We would also like to help with the implementation, because it is a lot of work.

lambdapioneer commented 6 months ago

Hey, my quick comments on the issues.

High-level comment: I would suggest we keep the DekraScript oblivious of how we structure the rest of the project (i.e. it does not need to know what vendor or firmware an APK is from). This accomplishes two important things: (i) the DekraScript stays flexible for other uses (UNIX tradition) and (ii) we don't need to synchronize project structure across to repositories.

Since we agree on most issues in general, do we want to transform this issue into a task list and discuss details in the individual issues?

  1. Obviously sounds good :)
  2. What timeout should aim for? I feel choosing something like 2xp95 would be a good start, but requires some quick measurements. Alternatively, we timeout on the whole process on a higher-level (i.e. our scripts).
  3. Maybe we can specify a sample CLI call and expected output structure? That way there is some sort of specification to work against. Personally, I am a big fan of all things examples and sample code. I added a suggestion below.
  4. Not sure if "hash" is the right word here. Maybe uniquely name since you seem to use an UUID in your example? Also, this would require a bit more details on how the input structure looks like... maybe this is better left for the higher-level logic that calls the DekraScripts? Then we leave the tool here flexible.
  5. See point 3 re/ sample input/output.
  6. I probably would leave the formula calculation entirely to the evaluation phase. That separates concerns and allows for easier experimentation with the parameters.
  7. That seems then to bake a lot of assumptions into the tools of this repository. Maybe that logic is better kept in the higher-level project?
  8. Generally agree. I think this can all be captured in an input/output as an easy specification.
  9. (missing)
  10. Strongly agree :) I think the best would be to simply wrap the current calls in a simple data layer and then exchange the implementation underneath.
  11. (no comment)

Regarding the last post: I am not sure what you mean with "tags within the Manifest files". Figuring that out for APK files is a quite hard problem indeed (see e.g. LibID: https://dl.acm.org/doi/pdf/10.1145/3293882.3330563)

Sample (just as a starting point, please copy-paste and share your variant)

Input file structure (keep in mind that the caller can name the sample.apk as it likes, the script does not need to know):

DekraScripts/
sample.apk

Call the single-APK script (keep in mind that the caller can name the output-folder as it likes, the script does not need to know):

$ python3 analyze_apk.py ../sample.apk -o ../output-folder

Output file structure:

DekraScripts/
sample.apk
output-folder/permissions.csv
output-folder/test-results.csv
output-folder/logging.txt

Output in output-folder/permission.csv:

permission;comment;
INTERNET;;
READ_STORAGE;;

Output in output-folder/test-results.csv (note that I mention both the MASTG test ID and the check name, this is because we might do multiple independent checks against the same test ID; the evaluation can make sense of it later):

test;check;result;comment;
MASTG-TEST-0013;check-for-hard-coded-keys;PASS;checked 1000 files
MASTG-TEST-0013;check-for-old-ciphers;PASS;checked 1337 files
MASTG-TEST-0016;check-for-non-secure-random;FAIL;found 2 failures;
MASTG-TEST-0015;check-for-key-resuse;NA;ignore because of test error

Output in output-folder/logging.txt:

...
20230101T1100 [Test0016.py] INFO Found non secure random in file with SHA call: CustomCrypto.java:111
20230101T1100 [Test0016.py] INFO Found non secure random in file with AES call: MyEncryption.java:42
20230101T1100 [Test0015.py] WARN The data flow analysis threw an error: "cyclic dependency detected, bail"
...

As said, this is just an example. And please copy-paste the next iteration below :)

ernstleierzopf commented 6 months ago

Hey, my quick comments on the issues.

High-level comment: I would suggest we keep the DekraScript oblivious of how we structure the rest of the project (i.e. it does not need to know what vendor or firmware an APK is from). This accomplishes two important things: (i) the DekraScript stays flexible for other uses (UNIX tradition) and (ii) we don't need to synchronize project structure across to repositories.

sounds good.

Since we agree on most issues in general, do we want to transform this issue into a task list and discuss details in the individual issues?

yes I will create atomar tasksr/issues which can be implemented on its own. However, I would leave the numbering as it is easier to mention in comments - added a checkbox above which will be extended with the issues.

  1. Obviously sounds good :)
  2. What timeout should aim for? I feel choosing something like 2xp95 would be a good start, but requires some quick measurements. Alternatively, we timeout on the whole process on a higher-level (i.e. our scripts).

by test 300 seconds are a good value.

  1. Maybe we can specify a sample CLI call and expected output structure? That way there is some sort of specification to work against. Personally, I am a big fan of all things examples and sample code. I added a suggestion below.

Good idea, but I do not have the answers on what the "best" structure is - this needs to be discussed when we are at that point.

  1. Not sure if "hash" is the right word here. Maybe uniquely name since you seem to use an UUID in your example? Also, this would require a bit more details on how the input structure looks like... maybe this is better left for the higher-level logic that calls the DekraScripts? Then we leave the tool here flexible.

Here I meant that the content of that file should be the hashes of each APK of the tested firmware image.

  1. See point 3 re/ sample input/output.

agree.

  1. I probably would leave the formula calculation entirely to the evaluation phase. That separates concerns and allows for easier experimentation with the parameters.

Yes, that is how I meant it to be. It should be completely separate and does not need to be called within the main processing task. I would leave this functionality in the Dekra project (but as a separate script)

  1. That seems then to bake a lot of assumptions into the tools of this repository. Maybe that logic is better kept in the higher-level project?

good point. But our higher-level project is still not open-source, so we could also keep it here (at least for now). It does not hurt the flexibility of the tool as you mentioned in your comment above.

  1. Generally agree. I think this can all be captured in an input/output as an easy specification.
  2. (missing)
  3. Strongly agree :) I think the best would be to simply wrap the current calls in a simple data layer and then exchange the implementation underneath.
  4. (no comment)

Regarding the last post: I am not sure what you mean with "tags within the Manifest files". Figuring that out for APK files is a quite hard problem indeed (see e.g. LibID: https://dl.acm.org/doi/pdf/10.1145/3293882.3330563)

Oh, I did not know that. Thought these libraries are simply stated in the Manifest file. Skip this step, if not easily possible.

Sample (just as a starting point, please copy-paste and share your variant)

Input file structure (keep in mind that the caller can name the sample.apk as it likes, the script does not need to know):

DekraScripts/
sample.apk

Call the single-APK script (keep in mind that the caller can name the output-folder as it likes, the script does not need to know):

$ python3 analyze_apk.py ../sample.apk -o ../output-folder

Output file structure:

DekraScripts/
sample.apk
output-folder/permissions.csv
output-folder/test-results.csv
output-folder/logging.txt

Output in output-folder/permission.csv:

permission;comment;
INTERNET;;
READ_STORAGE;;

Output in output-folder/test-results.csv (note that I mention both the MASTG test ID and the check name, this is because we might do multiple independent checks against the same test ID; the evaluation can make sense of it later):

test;check;result;comment;
MASTG-TEST-0013;check-for-hard-coded-keys;PASS;checked 1000 files
MASTG-TEST-0013;check-for-old-ciphers;PASS;checked 1337 files
MASTG-TEST-0016;check-for-non-secure-random;FAIL;found 2 failures;
MASTG-TEST-0015;check-for-key-resuse;NA;ignore because of test error

Output in output-folder/logging.txt:

...
20230101T1100 [Test0016.py] INFO Found non secure random in file with SHA call: CustomCrypto.java:111
20230101T1100 [Test0016.py] INFO Found non secure random in file with AES call: MyEncryption.java:42
20230101T1100 [Test0015.py] WARN The data flow analysis threw an error: "cyclic dependency detected, bail"
...

As said, this is just an example. And please copy-paste the next iteration below :)

luisadnsaavedra commented 6 months ago

Regarding the "tags" from the ManifestFile, I think you are both right, there are things we can obtain as we did in the ModZoo project directly from the ManifestFile. But in order to know specific versions of libraries and get a full list of libraries we would need a tool like LibID which has exponential costs and is not scalable beyond a few libraries.

Again, this is also something that could be kept separate from the Dekra tools if it simply outputs the Manifest file, or could be incorporated.

ernstleierzopf commented 6 months ago

Regarding the "tags" from the ManifestFile, I think you are both right, there are things we can obtain as we did in the ModZoo project directly from the ManifestFile. But in order to know specific versions of libraries and get a full list of libraries we would need a tool like LibID which has exponential costs and is not scalable beyond a few libraries.

Again, this is also something that could be kept separate from the Dekra tools if it simply outputs the Manifest file, or could be incorporated.

Wow this is exactly what I was searching for all the time! We definitely need to include this in our pipeline. I am not sure, but we could let it run per app/library and compare new files against that profile - obviously we can skip files with the same hash. Needs some work to be done in advance, but some of it we also need to do for the Dekra scripts (like storing the hash of the APK).

Did you test this tool in your previous work?

ernstleierzopf commented 3 months ago

Regarding our meeting today, I want to answer the question of how the results of the analysis should be stored. To allow us running the analysis only once per app, we need to separate all of the steps.

Therefore we first hash all of the apps and store the hashes in one file (the hash at the end is currently used from the results of the script which seems to be the UUID? We should use the hash of the whole image file (zip)).

analyze_apk.py script runs all tests and writes the results into the four different files. These files will have results for all apk files, so we are able to copy the interesting lines to a separate result directory in main.py. This script also should check if the apk was already analyzed (by searching for the hash in the test-results.csv file).

The calculate_formula.py script reads all of the previously calculated hashes and gets the necessary results from test-results.csv (and if necessary from other files). Then it calculates the formula and outputs the configuration from config/methods_config.yml in some form.

main.py combines all of the above scripts and merges the results in image-specific results only containing the apk files from that specific image. This means that it copies the affected result lines from all of the result files into a separate directory for that image.

If we want to change the permission configurations, we can simply run calculate_formula.py with the hashes file as parameter and store the new results in the same directory as the hashes file (with new timestamp for different results).

luisadnsaavedra commented 2 months ago

Regarding the "tags" from the ManifestFile, I think you are both right, there are things we can obtain as we did in the ModZoo project directly from the ManifestFile. But in order to know specific versions of libraries and get a full list of libraries we would need a tool like LibID which has exponential costs and is not scalable beyond a few libraries. Again, this is also something that could be kept separate from the Dekra tools if it simply outputs the Manifest file, or could be incorporated.

Wow this is exactly what I was searching for all the time! We definitely need to include this in our pipeline. I am not sure, but we could let it run per app/library and compare new files against that profile - obviously we can skip files with the same hash. Needs some work to be done in advance, but some of it we also need to do for the Dekra scripts (like storing the hash of the APK).

Did you test this tool in your previous work?

Sorry for the late reply I somehow missed this. Yes, we tested this tool together with the original authors. Its runtime increases exponentially with the number of libraries (and versions) you want to test against. I.e. it is great to check for the presence of a handful or so library versions that might be compromised etc. but very slow if you wanted to use it for finding which libraries an app uses out of hundreds (like we did at the time).

So we can use it for our project if we are careful about the scope.

ernstleierzopf commented 2 months ago

Regarding the "tags" from the ManifestFile, I think you are both right, there are things we can obtain as we did in the ModZoo project directly from the ManifestFile. But in order to know specific versions of libraries and get a full list of libraries we would need a tool like LibID which has exponential costs and is not scalable beyond a few libraries. Again, this is also something that could be kept separate from the Dekra tools if it simply outputs the Manifest file, or could be incorporated.

Wow this is exactly what I was searching for all the time! We definitely need to include this in our pipeline. I am not sure, but we could let it run per app/library and compare new files against that profile - obviously we can skip files with the same hash. Needs some work to be done in advance, but some of it we also need to do for the Dekra scripts (like storing the hash of the APK). Did you test this tool in your previous work?

Sorry for the late reply I somehow missed this. Yes, we tested this tool together with the original authors. Its runtime increases exponentially with the number of libraries (and versions) you want to test against. I.e. it is great to check for the presence of a handful or so library versions that might be compromised etc. but very slow if you wanted to use it for finding which libraries an app uses out of hundreds (like we did at the time).

So we can use it for our project if we are careful about the scope.

No problem, happy to include this tool if we find some interesting filters for the versions/apps we want to analyze.

ernstleierzopf commented 2 months ago

@frandelpinodekra @jmariasantosdekra @noeguedek please check out the new version of the changes. I removed non-important changes and made the expected behavior of the scripts more clear. Now there is an exact specification of which parameters and results the scripts should have.

If you want, I can help you with creating these scripts. Just push the current changes to git and let me know that I can work on it (by working in my own fork and creating a pull request).