Closed obi1kenobi closed 6 years ago
CC
This issue has been assigned CVE-2017-1000246.
Hi there, thanks for the interesting report and congrats on getting your CVE.
Although I agree with you in principle and we are going to fix this shortly, I'm trying to look into the applicability of an attack based on your report and I'd appreciate your feedback.
. For example, if the IV is reused with the same key in AES-CTR mode,
We do not use or support CTR mode https://github.com/rohe/pysaml2/blob/4375361939e942c4dd666d3ca4e1159858404bc4/src/saml2/aes.py#L11-L15
Even if the attacked could decrypt the data, the only thing we encrypt is the username of the authenticated user before we store that to a cookie. This username is not secret ( in fact it would probably be included in a possibly non encrypted SAML Assertion a few steps down the line ) and I would be more concerned with the attacks against the integrity(authenticity) of the value rather than its confidentiality. **
For CBC mode, since the username vocabulary is limited, I would argue that it would be difficult to mount a chosen plaintext attack ( plaintext = username) and thus, even though the IV reuse is bad practice, I don't see how anyone using this code would be vulnerable to any attacks.
What do you think @obi1kenobi ?
** We don't do a pretty good job with that too admittedly, but I will be opening a separate issue for that.
Hi @jkakavas, thanks for the reply.
The issue exists inside the class that handles AES encryption generically for the entire library, and not just for encrypting usernames. I'd argue that even if usernames are indeed the only thing being encrypted right now, it isn't safe to assume that will continue to be the case in the future.
I agree that attacks on the message authenticity are concerning, as you point out.
I'd suggest three things:
AESCipher
class, and instead randomly generate the IV for each encryption.The IV reuse problem must be fixed before switching to AES-GCM though, since AES-GCM has a similar construction to AES-CTR and fails in the same way when the IV is reused. Also, you might consider AES-GCM-SIV, which is resistant to reused IVs but is slightly more expensive in terms of computation time.
The issue exists inside the class that handles AES encryption generically for the entire library, and not just for encrypting usernames. I'd argue that even if usernames are indeed the only thing being encrypted right now, it isn't safe to assume that will continue to be the case in the future.
Yes, we only use it to encyrpt the username in the cookie, so it is not currently something that can e exploited. I agree though that it should be changed so we will not unknowingly inject any vulnerabilities in future versions.
I will make a pull request to address this. Thanks again for reporting it.
Sounds great -- thank you for looking into fixing this.
@jkakavas Is there a PR addressing this?
@kylegibson No, not yet. I can work on it soon-ish but if you have something you can contribute, please do by all means
Was this issue ever resolved? My team is working on resolving all the CVEs, and we're not sure if simply upgrading the version does it, or what commit fixes it if any, or if we have a solid reason for ignoring it.
@dcripplinger as far as I can tell, the issue appears to not be resolved, at least not on the master
branch. I believe it is still a problem that should be addressed, and it should be addressable by simply generating a new IV for every encryption operation.
@jkakavas :wave: Hello! I'm on the GitHub team responsible for sending security alerts for vulnerable versions of Python libraries. It appears that the issue described here (assigned CVE-2017-1000246) affects all versions as of its writing (<= 4.4.0
). Since its writing, version 4.5.0 was released. Can you share whether version 4.5.0 fixes this issue?
As it stands now, we plan to alert pysaml2
users today on vulnerable versions <= 4.5.0
(all currently released versions), providing no remediation steps as no fixed version has yet been identified. Can you please confirm that this information is accurate? Thank you! ๐
I am not a cryptographer so I cannot easily decide whether using AES-GCM
or AES-GCM-SIV
(as proposed by @obi1kenobi ) is the right thing.
Instead of relying on a fixed, randomly-generated IV, it would be better to randomly-generate a new IV for every encryption operation
However, to generate a new random IV for every encryption operation, one only needs not provide an IV in the first place. I will work on @obi1kenobi proposals immediately.
we plan to alert pysaml2 users today
@laserlemon sending security alerts is really helpful and mindful to do. It is great to have a team like that and thanks for doing that ๐ A security alert though, usually panics people, and as such it should be of actual value; it should be an actual issue. Even though there is a CVE assigned, there is nothing of real interest here as @jkakavas already explained earlier. I wouldn't like to wake up with a security alert that ends up not being an actual issue.
Ofcourse, this is no excuse by the pysaml2
-team not having acted upon this for so long ๐. I am here to fix this.
[offtopic] ๐ฐ
I would be really interested to that list. I would like to be able to contact users of pysaml2
to understand their use cases and alert for breaking changes. Is this something the pysaml2
-team can have access to? Can we generate that list somehow?
Again, thanks for doing this :octocat: I will prepare a PR soon.
Pull Request #519 created to address this.
I agree with @c00kiemon5ter's comments. Although this is a valid finding, I can't see how this can be exploited in any way (my thoughts/analysis on above) so I'm not sure the notification is helpful in that context. At least maybe since there is no 4.5 released, the notification could at least point to this issue in order for anyone interested to see why this is considered not exploitable.
@c00kiemon5ter maybe information on this can go out to the pysaml mailing lists so that folks are not taken by surprise by a potential GH notification ?
I would be really interested to that list. I would like to be able to contact users of pysaml2 to understand their use cases and alert for breaking changes. Is this something the pysaml2-team can have access to? Can we generate that list somehow?
-1 for me. I would rather invite people to voluntarily join the mailing list by pointing it out in the Readme, than obtaining such a list via Github ( which I don't think they're in a place to provide eitherway ) and sending out unsolicited emails. ( We probably shouldn't also, GDPR being one of the reasons at least ) WDYT ?
I am not planning to start a campaign ๐I am mostly interested in which parts of the library are being used, and which can be replaced or dropped without making the users inconvenient. It is mostly about us peeking into their projects and learning about their use cases, rather than contacting them. If we know there are users of a part of the library that is being changed (a breaking change) we can work on a migration path or propose a complementary tool.
Thank you everybody for your comments! After hearing your feedback, we decided not to send notifications to repos that currently use a version of pysaml2
that's "vulnerable" to this specific issue. Please note though, should someone push a new requirements.txt
that targets a vulnerable pysaml2
version, they will receive a notification at that time.
Also, in regards to your question:
I would like to be able to contact users of pysaml2 to understand their use cases and alert for breaking changes. Is this something the pysaml2-team can have access to? Can we generate that list somehow?
Yes, you can see the public repos that depend on pysaml here on your Dependency Graph page. We are not able to provide additional contact information for users of pysaml2
.
Thanks for providing such helpful feedback and for maintaining pysaml2
! :clap:
The library we use to implement AESCipher
is cryptography
. I've looked into AES
, the different modes and how they are supported. The recommended practice by cryptography
is to use Fernet
(spec).
Fernet
guarantees that a message encrypted using it cannot be manipulated or read without the key.Fernet
is an implementation of symmetric (also known as โsecret keyโ) authenticated cryptography.
The interface is also very simple ๐Fernet
provides generate_key
, encrypt
and decrypt
. Behind the scenes:
Implementation
Fernet
is built on top of a number of standard cryptographic primitives. Specifically it uses:
AES
inCBC
mode with a 128-bit key for encryption; usingPKCS7
padding.HMAC
usingSHA256
for authentication. Initialization vectors are generated using os.urandom().
I think Fernet
fills our use case and it is recommended practice. It does not use GCM
or GCM-SIV
(both of which are supported by cryptography
, but introduce the authentication tag that needs to be kept available.)
I think I will go ahead and wrap Fernet
to replace AESCipher
.
Thank you everybody for your comments! After hearing your feedback, we decided not to send notifications to repos that currently use a version of pysaml2 that's "vulnerable" to this specific issue. Please note though, should someone push a new requirements.txt that targets a vulnerable pysaml2 version, they will receive a notification at that time.
I think this is reasonable ๐ Thanks!
Yes, you can see the public repos that depend on pysaml here on your Dependency Graph page. We are not able to provide additional contact information for users of pysaml2.
That's great, thanks for pointing us there - I did not know about the dependency graph.
Cheers :octocat:
@laserlemon Just as a quick FYI, we did get a notification about this vulnerability on one of our repos (perhaps because it's private so the dependency graph features had to be enabled?). Not a big deal as we were already tracking this, but wanted to let you know in case that's not what was anticipated per your note above.
I think the decision to show the warning on GitHub is wrong. This issue doesn't actually cause any issues anyone would care about. The big warning is just going to scare people for no good reason. It's not like using pysaml will mean someone will be able to hack your site.
The Server class randomly generates a fixed 16-byte initialization vector (IV) for the purpose of encrypting data. Then, via the UsernamePasswordMako class, that fixed IV makes its way to the
AESCipher
class, where it is consistently reused for encrypting data.Initialization vector reuse like this is a security concern, since it leaks information about the encrypted data to attackers, regardless of the encryption mode used. For example, if the IV is reused with the same key in AES-CTR mode, the attacker will very likely be able to entirely decrypt the encrypted data: https://crypto.stackexchange.com/questions/2991/why-must-iv-key-pairs-not-be-reused-in-ctr-mode
Instead of relying on a fixed, randomly-generated IV, it would be better to randomly-generate a new IV for every encryption operation. Here are a couple of links that have more information on why that is the preferred approach: