SNI is not reliable, Use DNS instead. Potential DNS limitations.

zied-turki commented 1 year ago

When we talk about TLS ECH and SNI reliability, in most cases we think about how to use DNS instead. In my understanding, enterprises believe that they can control TLS ECH flows because they have full control of the user device and the corporate DNS:

DNS records namely DNS SVCB and HTTPS RRs can be filtered
TLS ECH can be deactivated from the browser

This is partially true. At least because of the following 2 use cases:

1. Unmanaged or BYOD user devices
2. Malicious activity / exploits

I would like to talk here about point 2. Let’s assume that an attacker somehow uses TLS ECH (a feature that any browser will support in the future, and which is supposed to be deactivated as per enterprise policy but can be bypassed somehow) and exfiltrates sensitive data. The attacker doesn’t need to use DNS to get the encryption key for the destination, as the attacker can preconfigure it. If this happens:

The corp. DNS doesn’t see any TLS ECH related DNS requests
As per the current design, the web proxy is not able to decrypt and inspect these flows. The proxy may not even be able to securely block this kind of activity.

So I don’t think that the full control of corporate DNS is enough to address TLS ECH enterprise security implications.
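To make the DNS control and its limit concrete, here is a minimal sketch in Python. It assumes HTTPS RR data in presentation format with no spaces inside parameter values; `strip_ech`, `ech_param` and `client_ech_config` are hypothetical helpers, and the `ech=` values are placeholders, not real ECHConfigs.

```python
from typing import Optional


def strip_ech(rdata: str) -> str:
    """Drop the `ech` SvcParam from HTTPS/SVCB RDATA (presentation format)."""
    priority, target, *params = rdata.split()
    kept = [p for p in params if p.partition("=")[0].lower() != "ech"]
    return " ".join([priority, target, *kept])


def ech_param(rdata: str) -> Optional[str]:
    """Return the value of the `ech` SvcParam, or None if it is absent."""
    for param in rdata.split()[2:]:
        key, _, value = param.partition("=")
        if key.lower() == "ech":
            return value
    return None


def client_ech_config(dns_rdata: str, preconfigured: Optional[str] = None) -> Optional[str]:
    """A client prefers a preconfigured ECHConfig and only falls back to DNS."""
    return preconfigured if preconfigured is not None else ech_param(dns_rdata)


# The corporate resolver filters the record before answering.
filtered = strip_ech("1 . alpn=h2,h3 ech=PLACEHOLDER_FROM_DNS")

# A well-behaved client gets no ECHConfig and cannot use ECH.
print(client_ech_config(filtered))

# A client that ships its own config never needed the DNS answer.
print(client_ech_config(filtered, preconfigured="PLACEHOLDER_FROM_ATTACKER"))
```

The filter works only for clients that depend on the DNS answer; the second call shows the bypass described in point 2.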

I look forward to your thoughts.

Thank you Zied

roelfdutoit commented 1 year ago

For malware to effectively use ECH it would have to hide the C2 destination in a large anonymity set, which can only be provided effectively by a CDN. Reputable security vendors provide risk levels and categories for destinations, including for the potential public facing ECH destination of the malware C2 server. For those reasons my opinion is that malware will still use DNS for ECHConfig.

zied-turki commented 1 year ago

Interesting! I understand and agree that malware would need to hide the C2 destination in a large anonymity set. But can you please help me understand how using DNS helps the malware keep this anonymity? Sorry, but I did not get it. Thank you Roelof.

roelfdutoit commented 1 year ago

The malware would typically use a public CDN, which would frequently cycle the ECHConfig. The only reliable way for the malware to have an up-to-date config is to get it from DNS.
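A toy model of that argument, as a sketch only. It assumes the front end accepts just the current and the previous config; the class name, the integer config IDs and the two-config window are illustrative assumptions, not how any particular CDN behaves, and the model ignores any recovery path a real deployment may offer.

```python
class RotatingEchFrontEnd:
    """Toy CDN front end that cycles its ECHConfig (modelled as an integer ID)."""

    def __init__(self) -> None:
        self.current_id = 0

    def rotate(self) -> None:
        """Publish a new config; anything older than the previous one is retired."""
        self.current_id += 1

    def published_config(self) -> int:
        """What the DNS HTTPS record would advertise right now."""
        return self.current_id

    def accepts(self, config_id: int) -> bool:
        """Accept only the current config and the one before it."""
        return config_id in (self.current_id, self.current_id - 1)


front_end = RotatingEchFrontEnd()
pinned = front_end.published_config()   # config baked into the malware at build time

for _ in range(3):                      # the CDN rotates a few times
    front_end.rotate()

print(front_end.accepts(pinned))                         # stale, pinned config
print(front_end.accepts(front_end.published_config()))   # config freshly read from DNS
```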

chris-wood commented 1 year ago

I would also add that malware -- by its very nature of being attacker controlled -- can choose names for servers that are totally meaningless or would otherwise not trigger inspection by middleboxes. The assumption that malware will continue to use "obviously bad" names seems invalid to me. Essentially, if your threat model takes malware into consideration, then it should also assume that a name-based approach to security is ineffective.

zied-turki commented 1 year ago

I get it, thank you both! Question (among others): let's assume that malware uses meaningful names and still uses DNS. What if the malware is able to use another DNS (not the corporate one) with encrypted DNS (DoH or any future encrypted DNS requests)? We have to be able to detect these DNS requests and block them as well. Maybe this is already done, but I think that we should carefully qualify how effective and reliable the use of DNS is, and list all the relevant arguments behind these ideas, as you did here, for future debate.

chris-wood commented 1 year ago

let's assume that malware uses meaningful names and still uses DNS. What if the malware is able to use another DNS (not the corporate one) with encrypted DNS (DoH or any future encrypted DNS requests)? We have to be able to detect these DNS requests and block them as well.

This threat model doesn't make much sense to me. Let's assume that such malware did exist, and that you did have a way to block connections that used this alternative (non-corporate) encrypted DNS resolver. The obvious thing for the malware to do would be to not use meaningful names, right? Or, more generally, why would we not assume the malware would just do something different that works around the name-based blocking mechanism that's in place?

At the end of the day, if you're trying to stop malware network connections, doing so via names is not effective.

zied-turki commented 1 year ago

Thank you Chris. I think I need to add further details to my question to make it clear. My assumptions as discussed above:

Malware would have to hide the C2 destination in a large anonymity set, which can only be provided effectively by a public CDN
Malware would typically use a public CDN, which would frequently cycle the ECHConfig
Malware would then use "meaningful" names

The question was: what if the malware did all of the above but used another DNS resolver rather than the corporate one? Is that possible? I am not sure, but I am just trying to understand what the limitations of using DNS would be.

chris-wood commented 1 year ago

Malware would have to hide the C2 destination in a large anonymity set, which can only be provided effectively by a public CDN

Why does it need to use a CDN to hide the destination? Why can it not just use a garbage name "asdajsndkajsbhdasghdjasd.com" or, perhaps even better, don't include any name in SNI? Don't those options also hide the destination?

jordan2175 commented 1 year ago

There are pretty good detection and policy rules in corporate proxies that detect garbage names, randomly generated names, and typo-squatting. Applications and sites that use these have a higher risk profile. It does not mean that they are for sure malware, but they are less normal and higher risk. So organizations tend to block based on their appetite for risk. Cybersecurity is about risk reduction through prevention, mitigation, or even remediation. The problem that I think people here are trying to capture is: how do organizations that have regulatory and statutory requirements to block certain traffic do that? Like it or not, it is a thing with real financial implications.
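As a rough illustration of one such signal, here is a minimal Python sketch. The two features (label length and longest consonant run) and the thresholds are assumptions chosen for the example, not what any vendor ships; real products combine many more indicators, and a hit only raises the risk score.

```python
VOWELS = set("aeiou")


def longest_consonant_run(label: str) -> int:
    """Length of the longest run of consecutive consonant letters."""
    longest = current = 0
    for ch in label.lower():
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def looks_generated(hostname: str, max_run: int = 4, max_len: int = 20) -> bool:
    """Flag the leftmost label if it is very long or hard to pronounce.

    Simplification: only the leftmost label is examined, so a name such as
    www.<random>.example would need extra handling.
    """
    label = hostname.lower().split(".")[0]
    return len(label) > max_len or longest_consonant_run(label) > max_run


for name in ("asdajsndkajsbhdasghdjasd.com", "example.com"):
    print(name, looks_generated(name))
```

A heuristic like this produces false positives on legitimate names with long consonant clusters, which is why it feeds a risk score rather than a verdict.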

chris-wood commented 1 year ago

The problem that I think people here are trying to capture is: how do organizations that have regulatory and statutory requirements to block certain traffic do that? Like it or not, it is a thing with real financial implications.

Yeah, I totally get this, and can empathize with the requirement. What I'm suggesting, however, is that this requirement to block certain traffic is not something that can be technically implemented with any sort of guarantee if the threat model includes malware. So this makes me wonder: what is the actual requirement, and what is the threat model in which that requirement is to be satisfied?

zied-turki commented 1 year ago

Yes. This is why it was mentioned above that malware still needs to use DNS. This helps to guarantee lower risk scores and obfuscate the C2 destination. I was not talking about specific requirements; I was just asking about the effectiveness of DNS filtering against potential malware leveraging TLS ECH. We can also discuss enterprise requirements, but I think it is better to create a new issue for that.

chris-wood commented 1 year ago

Sorry, @zied-turki, but I'm not following you. To me, this issue doesn't make much sense if one considers malware in scope, so I'm not sure what you're trying to accomplish here. Can we take a step back and look at the actual requirement that's motivating this issue?

zied-turki commented 1 year ago

No worries Chris, you don’t have to be sorry 😉 The question was discussed in the last call and participants suggested documenting it here (hence the creation of this issue). Maybe I’m mixing up different issues. Talking about “filtering” and the threat model described here is confusing as well. I will think about it and we can discuss it in person in the upcoming call to decide together what action to take with regard to this issue. Thank you.

chris-wood commented 1 year ago

When is the next call? Would it be helpful if I joined?

taddhar commented 1 year ago

Hi all, first thank you to all of you for your contributions on this issue.

As shared last week and in calls, I am in ITU-T TSAG which is finishing now and I set expectations I would not be able to do anything else this week.

As all is settling down I will send an invite shortly for a call at 5pm CET on Wednesday 7pm

In addition I hope to progress my own contributions to this one.

chris-wood commented 1 year ago

Sounds good. Thanks @taddhar =)

PascalPaisant commented 1 year ago

Just a remark about Chris's comment that "malware [...] can choose names that are totally meaningless or would otherwise not trigger inspection by middleboxes". By default, DLP solutions trigger inspection of all outbound traffic, except for selectively excluded domains. In other words, inspection is not triggered by an "obviously bad" name, but is the normal case. As far as DLP is concerned, the objective is not to block traffic based on the target destination name (or by any other means), but to be able to decrypt unfiltered outbound traffic in order to inspect it.

jordan2175 commented 1 year ago

It is also important to note that middleboxes (aka proxies) are much more complex than just using simple things in isolation. Understanding malware activity or threat actor / intrusion set activity requires a set of indicators and their interactions. But the common use case for this is actually the reverse: I want to inspect everything other than banking or medical traffic to known and well-established banking and medical sites.
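That "inspect by default, exempt selectively" model can be sketched in a few lines of Python. The host names, the category feed and the policy table below are hypothetical placeholders; a real proxy would combine many more indicators than a single name lookup.

```python
from enum import Enum


class Action(Enum):
    INSPECT = "inspect"   # decrypt and run content inspection
    BYPASS = "bypass"     # tunnel without decrypting
    BLOCK = "block"       # refuse the connection


# Hypothetical category feed, as a security vendor might supply.
CATEGORY_BY_HOST = {
    "bank.example": "banking",
    "clinic.example": "medical",
    "malware.example": "prohibited",
}

# Hypothetical policy: only listed categories deviate from the default.
POLICY = {
    "banking": Action.BYPASS,
    "medical": Action.BYPASS,
    "prohibited": Action.BLOCK,
}


def decide(hostname: str) -> Action:
    """Inspect by default; bypass or block only for known categories."""
    category = CATEGORY_BY_HOST.get(hostname.lower(), "uncategorized")
    return POLICY.get(category, Action.INSPECT)


for host in ("bank.example", "unknown.example", "malware.example"):
    print(host, decide(host).value)
```

Note that the exemptions, not the inspection, are what depend on knowing the destination name.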

chris-wood commented 1 year ago

@PascalPaisant @jordan2175 if the default is to inspect, then I fail to see how the lack of visibility into the SNI does anything other than harm performance (you have to inspect all connections as opposed to only a select few). Is that what this boils down to, or is there some other functionality or feature that's affected here?

PascalPaisant commented 1 year ago

To inspect the https payload, we first need to decrypt it. As I wrote in issue#61, my understanding is that : "all data transmitted in an https session is encrypted using a shared secret exclusively known by the two ends of this session. [..] Since only the two ends of an https session can decrypt these data, the middlebox needs to create two back-to-back TLS sessions: one from the client to itself and the second from itself to the target destination, with which it will initiate a TLS handshake. This means the middlebox must, by any means, know this final target destination, which is identified in the SNI." So, the lack of SNI visibility would affect the interception / inspection functionality.
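A small Python sketch of that dependency, under simplifying assumptions: `ClientHelloView` is a hypothetical, already-parsed view of the ClientHello as seen on the wire, and the host names are placeholders. With ECH, the name visible in the outer ClientHello is only the client-facing public name, so it no longer identifies the final destination.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ClientHelloView:
    """What an on-path proxy can read from a ClientHello (simplified)."""
    sni: Optional[str]   # server_name visible on the wire, if any
    has_ech: bool        # True if an encrypted_client_hello extension is present


def upstream_target(hello: ClientHelloView) -> Tuple[Optional[str], bool]:
    """Return (visible_name, name_identifies_final_destination)."""
    if hello.sni is None:
        # No name at all: the destination must be inferred some other way.
        return None, False
    # With ECH the real name travels encrypted inside the inner ClientHello.
    return hello.sni, not hello.has_ech


print(upstream_target(ClientHelloView(sni="shop.example", has_ech=False)))
print(upstream_target(ClientHelloView(sni="cdn.example", has_ech=True)))
```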

Btw, I'd rather talk about inspection functionality (which is really what I care about for my enterprise security) than about a middlebox, which is just an implementation option and doesn't say a lot about what the middlebox does.

PS: I am leaving on vacation today for a couple of weeks, so don't expect much activity from me on this issue during that time.

chris-wood commented 1 year ago

Well, again, if you have the ability to decrypt the traffic as a middlebox, then use of ECH is irrelevant and has no effect on the overall system beyond perhaps regressing performance.

jordan2175 commented 1 year ago

I think we need to focus on outcomes rather than logistics of doing x or y. I agree with @PascalPaisant that middle boxes are just one way this might be done.

There are regulatory requirements that say things like "Organizations MUST inspect ingress and egress traffic looking for X, Y, and Z or prevent traffic M, N, and O but MUST NOT inspect traffic of type A, B, and C that is destined to the following sectors or verticals H, J, K."

So just saying use DNS or just use SNI or just use the server certificate is not really useful. We need to make sure that things can work with regulation; otherwise we further increase the balkanization of the Internet or, worse, give fuel to the idea of adopting NewIP.

We need to protect people's privacy and have strong protections for end users that do not know better, but we still need to allow regulated industries to do what they need.

chris-wood commented 1 year ago

There are regulatory requirements that say things like "Organizations MUST inspect ingress and egress traffic looking for X, Y, and Z or prevent traffic M, N, and O but MUST NOT inspect traffic of type A, B, and C that is destined to the following sectors or verticals H, J, K."

This is helpful context. Can you please provide a citation to such requirements?

taddhar commented 1 year ago

Hi all, I am back from holidays and had a 21-hour day yesterday on my return.

@chris-wood I am not sure there is a citation for that, but we can research it. In short, this is a pure application of e.g. GDPR:

you must protect the data, so you must prevent the bad things from coming in and make sure the good things are not leaving (which imposes content analysis for malware on ingress and data loss prevention on egress)
at the same time you must minimise what you inspect, so you must selectively choose what you inspect

This is enforced on banks here, for example, by the financial auditor (here, the European Central Bank).

But in general each customer will have a compliance team that will look at ALL the regulations they must apply and will logically deduce what they need to do with all of them to give requirements to their security team. So for example they look at GDPR, DORA, CRA, etc. for the EU, plus US privacy acts (California, etc.), plus other regions, and then distill down what they must implement, why, the liabilities if they do not, and so on, and finally produce the requirements for their security team.

Hence the requirement for selective decrypt; I do not know a single customer in my realm that is not doing it (I am talking about the top 1000 customers here).

So, bottom line: I doubt the regulations will hard-code any technical means in their text, but this is how they translate in practice.

jordan2175 commented 1 year ago

Yeah the auditors often ask for this in regards to various things like GDPR, Banking, PCI-DSS, School child protections, and even in Government / Critical Infrastructure. It has been a while since I had to gather requirements from regulations in these sectors. If I get time I will go hunt for the actual text.

chris-wood commented 1 year ago

I think what I struggle with here is the jump from "need to protect privacy" to "you need to intercept and decrypt traffic to inspect it." Are regulations written in a way that requires interception, or is it rather the case that regulation does not specify the enforcement mechanism and it just so happens that many interpret these regulations as "we need to decrypt to meaningfully enforce"?

taddhar commented 1 year ago

I understand your question, and I am not a compliance professional, so I cannot provide an authoritative answer. Those I know are mostly on holiday right now. My suspicion is that regulations will not make statements like "you must implement data loss prevention" (and therefore do selective decrypt). They will and should keep to higher-level requirement language. DLP and selective decrypt are probably very low-level implementation language that I would not imagine any regulation would or even should carry.

As an exercise, I just had a look at GDPR as an example and went through a few articles. If you check Articles 5, 32 (especially clause 2), 33 and 34, then an interpretation of how to implement them will lead you to Data Loss Prevention; and since you want to minimize inspection, you then immediately arrive at selective decrypt, and you have a good solution that meets those requirements.

What I don't know, however, is how the auditors interpret this (e.g. the European Central Bank for the FSI world) and whether they are imposing/enforcing stronger and more explicit requirements on customers. What is for sure is that ALL my FSI customers implement DLP with selective decrypt, without exception. Given the little I know about how hard some recent (6-month-long) audits of some banks have been, and given the WhatsApp story in the US involving a dozen FSIs, I do not think that anyone will take this as a theoretical exercise.

But happy to discuss, and happy to be proven wrong or educated / augmented by any party.

And maybe I will get a more authoritative answer soon, which I will be happy to share.

Please note I only took 1 regulation for 1 aspect (Data Loss Prevention) with 1 additional requirement (minimization), which leads to DLP and selective decrypt as a solution to meet this requirement. There are dozens if not hundreds of regulations that compliance teams must address, and they probably must find the best common denominator across all of them, with the most cost-effective solution for the risk levels of their organizations ... and I appreciate this is a job, and not mine.

Hope this helps a little bit

chris-wood commented 1 year ago

@taddhar what's problematic here is that (1) DLP and minimization together don't make any sense to me, i.e., I don't see how you can have both at the same time, and (2) this is just one interpretation of a set of poorly defined requirements. So using this as an argument against things like ECH and whatnot doesn't make much sense to me. I really would like to see this fleshed out in more detail, and would like to see a technical reason -- grounded in real requirements -- as to why ECH is problematic.

taddhar commented 1 year ago

@chris-wood I am doing my best to explain, and I accept that my explanation might not be good enough.

Yet compliance teams in organizations that are doing DLP do not want to do DLP on everything, because they want to be as unintrusive as possible. So they will only implement use cases such as: if the policy says that data uploaded to x.com must be inspected, then they will only look at egress x.com flows. To do that you need to implement a selective decrypt mechanism. That is what I wanted to say on your point (1).

Now for your point (2): I cannot pass judgement on what regulators choose to define, whether it is poor or not; it is what it is. But what is for sure is that Data Privacy Officers (DPOs) to start with, with the help of CISOs and their teams, must address those requirements as they are and demonstrate compliance to their auditors.

So of course there is room for interpretation (I don't remember how many permutations you can have on how to interpret GDPR overall beyond our discussion here, but it is in the hundreds and requires an army of lawyers, per country, per customer, per ... etc.), so please do not shoot the messenger here.

So DPOs, with the help of their CISO teams, must find an appropriate solution that also meets other requirements, e.g. cost, risk, fit with other regulations, geographic coverage and whatnot.

This is a job, and at the end of this job large portions of enterprise organizations have recognized that Data Loss Prevention is the solution that fits their issues. So my point is that the initial requirements may be 'poor' to you but they are absolutely 'real' in the industry.

You can kill the messenger if you want here, but that will not change the point that once ECH is deployed, no one can determine the destination of the flow, so selectivity won't apply and so DLP won't work.

If you have a magic solution to help change this, please share. I don't have one.

BTW, as @jordan2175 pointed out earlier, we could discuss the ingress flow too.

Happy to discuss more e.g. tonight

PascalPaisant commented 1 year ago

I’d also like to insist on the fact that this subject is not only a compliance issue, and btw I don’t want to open a debate about the usefulness of security regulations: they are what they are, and an enterprise has no other choice than to comply with them: « dura lex sed lex ».

If we forget compliance for a while, a company has assets. Some of them are considered vital, because their loss may cause the death of the company (or at least extremely harmful damage). For some enterprises (such as systemic banks), some assets are even declared vital for the country. As a consequence, enterprises must implement policies to ensure that sensitive assets are not leaked to unauthorized parties.

That’s why there is a need for a function to selectively inspect the myriad of bytes that flow out of the enterprise daily. The guiding principle is that communications with some destinations are blocked, and that for the others, not all types of data can be sent. Nobody would like to see their own credit card number published on the dark web, but it might be perfectly legitimate to send a credit card number in an authorized commercial exchange.

As of today, this selective inspection functionality is provided by DLP running in so-called middleboxes. My understanding is that the implementation of ECH will prevent current solutions from doing the job. I’m fine if this function can be offered by another solution, provided it has the same level of security at the same cost (including migration costs), but my enterprise needs this functionality.

Also, it must be remembered that an enterprise is not like the « open world ». Employees use communication systems provided by the enterprise to do the job they have been given. There are some subtleties in internal company rules (and also in some countries' laws) which frame under what conditions and with which procedures data uploaded by an employee to the web can be inspected. But at the end of the day, it is our duty to check that no sensitive data has been leaked, either through carelessness or through ill will.

echdeploy / draft-ech-deployment-considerations

SNI is not reliable, Use DNS instead. Potential DNS limitations. #66