Closed ZetaTwo closed 1 year ago
Hi @ZetaTwo. that is helpful information, but as I do lack experience, could you elaborate a bit more on which file to start from or how to get started with exact locations or file_names.
Also let me know, if you are working on anything!
If you follow the reproduction instructions you will see where in the code the error is thrown and you can start looking there.
Ah okay, I'll give it a shot and update you soon on it! Thanks
@ZetaTwo I noticed that this issue has been idle for some time, and after looking into it, I believe I have identified the root cause and a potential solution. I'm really interested in contributing to this project and would be delighted to take ownership of this issue.
Based on my investigation, it appears that creds = service_account.Credentials.from_service_account_file(file_path)
inside the method get_creds_from_file
raises exception when the json file is not valid.
A possible solution that i thought of is as follows:
In the method get_creds_from_file
From what i could find, we do have some formats that each field in the json file is supposed to follow link: create and delete service account keys
"type": "service_account",
"project_id": "PROJECT_ID",
"private_key_id": "KEY_ID",
"private_key": "-----BEGIN PRIVATE KEY-----\nPRIVATE_KEY\n-----END PRIVATE KEY-----\n",
"client_email": "SERVICE_ACCOUNT_EMAIL",
"client_id": "CLIENT_ID",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/SERVICE_ACCOUNT_EMAIL"
}
Please let me know if i am missing something.
I think you misunderstand the issue here. We don't need to do more validation of the key. That is already handled by the from_service_account_file
function.
The problem is that the scanner stops as soon as it encounters an invalid key and it doesn't tell which key is invalid. It would be better if it tried to parse all keys and then provided one single list of all the keys that could not be parsed.
@ZetaTwo Thanks for the clarification. From what i found, the scanner stops even if the file is not json. We get the exception from the from_service_account_file
.
I tried a couple of scenarios:
Scenario 1: remove double string from key "project_id": "PROJECT_ID" > changed it to "project: "PROJECT_ID"
Scenario 2: remove hypen from private key "private_key": "-----BEGIN....." > changed it to ""private_key": "BEGIN"
In both cases, the scanner crashes without proper error message.
This is what i think could work:
First we should try to see if the file is json and only then we try to parse the keys to find the problematic ones.
For the invalid json file we can inform the user to provde the valid json file and leave it up to the user to provide the correct json file.
For key parsing, i thought to use the template to understand, how the key should be.
for example:
while parsing private_key we make sure the structure of the key is as follows
-----BEGIN PRIVATE KEY-----\nPRIVATE_KEY\n-----END PRIVATE KEY-----\n"
We can have a template for how each key should be and based on that we can understand which keys are problematic and can inform the user without having to call the method from_service_account_file
No there is no need to try to validate the key format. This is completely unnecessary. The problem is how the errors are handled and reported.
@ZetaTwo the method that throws exception is part of google/ouath2
package and not gcp_scanner. I am having trouble understanding the scope of this solution. Where do we want to resolve this issue? Exception raised from from_service_account_file
method does provide some insight as to where the problem lies.
For example, if the private key is malformed, we get this error raise exceptions.MalformedError("No key could be detected.")
do we want to understand from the exception as to which key is related to the exception or is there some other way that i missing?
No it is enough to know which keyfile has an error. Currently we do something like this (pseudo-Python):
keys = []
for file in directory:
keys.append(from_service_account_file(file))
but we need to do something like this:
malformed_files = []
keys = []
for file in directory:
try:
keys.append(from_service_account_file(file))
except AppropriateException:
malformed_files.append(file)
if len(malformed_files) > 0:
log.error('Failed to parse keyfiles(s):')
for file in malformed_files:
log.error('...')
return False
...
@ZetaTwo I understand now. Thanks for the explanation. This is what i thought when you were talking about parsing all the keys.
By "keys", I thought you were talking about parsing individual keys inside inside a single json file.
"private_key_id": "KEY_ID", "private_key": "-----BEGIN PRIVATE KEY-----\nPRIVATE_KEY\n-----END PRIVATE KEY-----\n", "client_email": "SERVICE_ACCOUNT_EMAIL",
Now i understand that we want to parse all the keyfiles (keyfile1.json, keyfile2.json and so on).
I can work on the fix if you want.
Yes, it's enough to know which file caused the error at this point. Knowing exactly which key is a separate issue which could potentially be interesting as well but it is not as important.
Okey. I will start working on the fix. Could you please assign me this issue? @ZetaTwo
Great. We don't really assign issues though so just submit a PR when you have a solution.
@ZetaTwo I have created a pull request for the fix. Looking forward to feedback.
Describe the bug
If one of the keys contains some error, for example the json is invalid or the key is not a valid key, GCP Scanner crashes with a very unhelpful error message.
To Reproduce
Expected behavior
Either
Current behavior
GCP Scanner crashes with an error telling me that it failed to parse some key