Open r33drichards opened 1 month ago
Shlink does need to be able to display API keys, for example in the api-key:list
console command, and I think it was needed in some other place as well.
Also, I think the raw key itself is used to identify API keys in a few places, so hashing them would affect this kind of mechanism as well. I would need to verify this though.
So I agree with you they should be hashed, and I would probably implement it differently today, but it's not easy to change now for backwards compatibility reasons.
I will plan on changing this, but it's going to take a while and needs to be tested carefully.
Among other things it will need the name to be mandatory, so that you have later a way to identify API keys in case you need to revoke, restrict, etc. In existing installations most API keys don't have a name ATM.
These are I few things I can think of, that would be needed in order to allow this change:
In addition to this, I need to check what's the performance impact on verifying the API key hash on every request. I guess it would be negligible, but something worth checking.
But overall, this needs to be done carefully, as once keys have been hashed there's no way back, so breaking users installations would be bad. They would need to generate new API keys.
On the other hand, one reason to hash user passwords is because they are provided by them, and that ensures nobody with access to the database info (legitimately or illegitimately), can know what the user originally provided, which they frequently reuse in other services.
API keys are random and generated by Shlink. If someone can access your database, you are basically screwed already, as they can do anything they would through the API. They don't really need the API key anymore, and the value itself does not present extra risks as it does with user-provided passwords.
Thanks for taking the time to reply!
Shlink does need to be able to display API keys, for example in the api-key:list console command, and I think it was needed in some other place as well.
Also, I think the raw key itself is used to identify API keys in a few places, so hashing them would affect this kind of mechanism as well. I would need to verify this though.
So I agree with you they should be hashed, and I would probably implement it differently today, but it's not easy to change now for backwards compatibility reasons.
ahh yea that makes sense, seems like encrypting the token and storing the encrypted value and decrypting it could be a viable option then.
Not sure if you need a name to revoke and delete a token, surely just selecting the token by its database id is enough? All of those things are not strictly necessary to implement column level encryption for the value of the tokens, but I agree especially on the ability to revoke tokens being an important and useful thing to add.
API keys are random and generated by Shlink. If someone can access your database, you are basically screwed already, as they can do anything they would through the API. They don't really need the API key anymore, and the value itself does not present extra risks as it does with user-provided passwords.
You are absolutely correct in this situation! If I was in a scenario where I had a database leak and I know about it, I'd have to rotate every secret key b/c they would all be compromised if I wanted to continue using the service as is as well. But if you consider the situation where my database is compromised and I DON'T know that its compromised, I have no way to prevent someone from using an api key that they would have obtained from the stolen database to continue to leak data. If someone obtained the db dump through a one time exploit or a purchase on the black market, they could continue to export data. If the values are encrypted or hashed, I could at least have forward secrecy with my data in the event of a leak.
Encrypting tokens can also help prevent against internal threats such as curious administrators or contractors who might have database access but shouldn't see raw API keys.
So yea in summary I agree that hashing is probably not an option for shlink, but having something that sits in front of the data layer that encrypts and decrypts that value so that its encrypted at rest would be a security enhancement.
Oh, I didn't mean to disregard hashing as an option, in fact I think it would be preferable if I had to choose.
I just wanted to a) provide more context, and b) explain the difference between user-provided and system-generated credentials.
One tricky thing I found is that the API key is currently checked via SELECT ... WHERE key={provided_key} LIMIT 1
. If the keys are hashed in the database, there's no longer a way to do a single SELECT via a non-hashed key.
I've been investigating, and since API keys are system-generated and random, in most places it is considered to be secure enough to use a non-salted hash. Since that produces always the same result for a specific key, it would allow to continue doing a single SELECT, but with the hash instead of the plain key.
A SHA256 is secure enough and quite fast so that it should not add any overhead to API requests.
Shlink version
4.2.0
PHP version
8.3
How do you serve Shlink
Self-hosted Apache
Database engine
PostgreSQL
Database version
16.3
Current behavior
When I look at my api_keys table for shlink, i can see the api key present, and it is not hashed
Expected behavior
Since the exact value of this api key is not needed, only needing to be checked against a supplied one, it should be hashed like a password, to prevent accidental leaks https://cheatsheetseries.owasp.org/cheatsheets/Password_Storage_Cheat_Sheet.html#background
Minimum steps to reproduce