Closed sivadeilra closed 9 years ago
Nominating. Regardless of whether this technically violates Rust's memory safety guarantees, it's extremely surprising and has major security impact.
It seems this was already noticed and the functions are suggested for deletion when std::io
gets overhauled for 1.0: https://github.com/aturon/rfcs/blob/io-os-reform/text/0000-io-os-reform.md#removed-methods
As a side note this feels like what unsafe traits would be useful for.
This was discussed as part of the IO reform RFC process, and it is believed not to lead to undefined behavior, at least if we follow a certain pattern of intrinsics use. Reading an uninitialized but alloced block of u8
data is type and memory safe.
There doesn't seem to be widespread agreement elsewhere (on IRC, etc) as to how serious a security risk this poses. However, note that we can close this potential hole easily by pre-zeroing the memory in these convenience functions, without causing any API breakage.
It would be helpful to see a more detailed example of an exploit based on the current behavior.
I'm frankly astonished. There is no way my team will accept Rust with this kind of information leakage within its "safe" subset.
How can you make any claim about type safety and security, with such a gaping hole?
On Thu, Feb 12, 2015 at 2:07 PM, Aaron Turon notifications@github.com wrote:
This was discussed as part of the IO reform RFC process, and it is believed not to lead to undefined behavior, at least if we follow a certain pattern of intrinsics use. Reading an uninitialized but alloced block of u8 data is type and memory safe.
There doesn't seem to be widespread agreement elsewhere (on IRC, etc) as to how serious a security risk this poses. However, note that we can close this potential hole easily by pre-zeroing the memory in these convenience functions, without causing any API breakage.
It would be helpful to see a more detailed example of an exploit based on the current behavior.
— Reply to this email directly or view it on GitHub https://github.com/rust-lang/rust/issues/20314#issuecomment-74163554.
@sivadeilra
First, I'm sorry if I wasn't clear, but no decision has been made here. I was asking for clarification.
I was trying to clarify first of all that there is not a type or memory safety hole here, see this comment thread. If you believe this can lead to a type or memory safety violation, can you please spell out the details? (cc @mahkoh)
Second, I was hoping you could spell out in more concrete detail an exploit based on this behavior alone. It would require a Reader
implementation to read from the buffer it's supposed to write into, afaict.
As I said in my comment, we always have the option of zeroing the memory in these convenience methods, which doesn't require any API change.
I'm basically with @sivadeilra here. This needs to be fixed for 1.0.
The exploit seems clear. The memory you read in read
is uninitialized, fresh from the allocator. It will contain whatever was there last — passwords, encryption keys (yeah, you should zero these on drop, but...), addresses of internal structures useful for defeating ASLR, etc.
You may not be worried about malicious read
impls but even an inept one may leak information. Exploits for systems like kernels or sandboxed browers are very complex. They often combine information leaks from a number of "minor" bugs.
... not lead to undefined behavior, at least if we follow a certain pattern of intrinsics use. Reading an uninitialized but alloced block of
u8
data is type and memory safe.
Sure, it's not UB. I'll even accept that it's not a "type and memory safety" issue. But this is a major lurking security risk in a language that otherwise goes out of its way to prevent you from reading uninitialized memory.
@kmcallister
The exploit seems clear. The memory you read in
read
is uninitialized, fresh from the allocator. It will contain whatever was there last — passwords, encryption keys (yeah, you should zero these on drop, but...), addresses of internal structures useful for defeating ASLR, etc.
But a read implementation doesn't read from the buffer, it writes into it. It would have to be very inept indeed to read from the buffer.
But a read implementation doesn't read from the buffer, it writes into it. It would have to be very inept indeed to read from the buffer.
No, it would just have to be a large software system with many components that interact in unanticipated ways. A read
implementation could transitively invoke thousands of lines of code...
There's no way I should have to write a "realistic" program and exploit to get people to take this seriously. But I'll do it if that's what it takes.
Any read of uninitialized memory in the safe subset of Rust seems 100% not okay. It doesn't matter whether it's technically a "memory safety" issue.
It would have to be very inept indeed to read from the buffer.
Argument by lack of imagination :)
"Only two things are infinite, the universe and human stupidity, and I'm not sure about the former."
@kmcallister
There's no way I should have to write a "realistic" program and exploit to get people to take this seriously. But I'll do it if that's what it takes.
I think you and @sivadeilra are reading way too much into my earlier comment: I'm trying to gain clarity here about the issues in concrete terms, because the original claim had to do with memory unsafety/UB, which turned out to be incorrect. I do take this seriously! Which is why I'm trying to understand the details.
Any read of uninitialized memory in the safe subset of Rust seems 100% not okay. It doesn't matter whether it's technically a "memory safety" issue.
This seems like the crux of the issue. There does not seem to be widespread agreement about this point, and I think we need to make a global decision about it.
cc @nikomatsakis @pcwalton
Or by not writing at all, returning a non-zero byte count, and then calling
push()
and seeing what memory was returned.
Yeah, screwing up the byte count seems more likely to happen than deliberately reading in the read
impl, and just as bad or worse.
@kmcallister Thanks, that's what I was looking for!
Sorry to pile on you @aturon, I know you're just fact-finding. I do have a pretty strong reaction to the "surely nobody would screw up that bad" logic, which is at odds with Rust's philosophy.
OK, after discussion here and elsewhere, I propose we write a short RFC that clearly establishes the following policy, which goes beyond ensuring lack of UB:
Safe APIs in Rust must never expose uninitialized memory, even in cases that do not lead to UB.
@kmcallister Would you be interested in working together on such an RFC?
I'll be honest, I was not following the full details of this discussion before. In particular, I mistakenly thought the focus was on the read
method, which takes in a buffer and writes into it, rather than the read_to_end
method. I talked this over a bit with @aturon (and followed @kmcallister's comments on IRC) and my feeling right now is that I agree with @kmcallister, we should uphold the principle that safe APIs do not expose uninitialized bytes. It seems to be a less important guarantee than undefined behavior, but important enough not to sacrifice lightly. Moreover, this problem only even comes up for a narrow range of APIs involving byte buffers; as a consequence those APIs don't feel especially rusty to me. And finally in this particular case, read_to_end
is merely a convenience function, so if this becomes a perf bottleneck somewhere, it is easily overcome.
I am somewhat conflicted here, but I am happy to go along with this consensus.
I've written an RFC to update our policy about safe code to cover this case.
To me, this hole does constitute a type safety hole. "Type safety" is not just the ability to prevent data structures from being modified outside of your language. It is the ability to preserve the invariants of the language. Put as precisely as I can, "type safety" is the property of a language, in which the language defines its type system / operations / invariants, and that the language controls the mapping from those language semantics to the underlying machine representation. That is why, to me, being able to read the machine state (outside of the language) is just as bad as being able to modify the machine state.
I believe many people have the perspective that conserving "type safety" means only preventing modification of machine state outside of language. I think the last 15 years or so of experience with security, in operating systems, browsers, and everything else, shows that being able to access information (that you're not supposed to access) can be just as dangerous as being able to modify information.
Keegan's point about the composition of large, complex systems is very important. Violations of invariants that may seem minor may become major, when combined with other seemingly-minor exploits, or when an attacker manipulates a "minor" exploit (such as reading dirty memory) in order to read a very juicy piece of information.
It is not important that this involves I/O. What's important is trusted system code (code within an "unsafe" block) is relying on untrusted user code (any implementation of the Reader trait) in order to maintain the invariant of type safety.
It may be possible to have some cake and eat some. It may be possible to have Rust provide efficient unsafe implementations of Reader, which cannot be abused by user code (i.e. have no extension points), while at the same time providing less-efficient extensible implementations which guarantee that the buffer is zeroed.
I would be extremely cautious about trading away a profound safety invariant for a small, quantitative gain in performance.
On Fri, Feb 13, 2015 at 10:18 AM, Keegan McAllister < notifications@github.com> wrote:
But a read implementation doesn't read from the buffer, it writes into it. It would have to be very inept indeed to read from the buffer.
No, it would just have to be a large software system with many components that interact in unanticipated ways. A read implementation could transitively invoke thousands of lines of code...
— Reply to this email directly or view it on GitHub https://github.com/rust-lang/rust/issues/20314#issuecomment-74299941.
I think it's worth noting that this behaviour can be replicated in "user-space" by just reusing a fixed length buffer several times without zeroing in between (e.g. there was a recent blog post that described itself as demonstrating heartbleed in Rust by doing that sort of thing). Of course, it's definitely not as insidious as using uninit memory, since its relatively obvious from the source, while implicitly reusing an old allocation is very subtle and non-local.
\src\libstd\io\mod.rs
contains two methods that allow attacking code to read memory that it should otherwise not have access to. The methods areReader.push()
andReader.push_at_least()
. An attacker could write (or exploit) an implementation ofReader
, by implementing aread()
method that reads from the given buffer, rather than writing to it. Or by not writing at all, returning a non-zero byte count, and then callingpush()
and seeing what memory was returned.The
push()
andpush_at_least()
methods should probably just be deleted entirely. The support functionslice_vec_capacity()
could also be deleted.