rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Document minimum size for `usize` and `isize` #48593

Open clarfonthey opened 6 years ago

clarfonthey commented 6 years ago

Right now, usize and isize are "guaranteed" to be at least a byte long, but nothing more. It seems unlikely that Rust will support 8-bit targets in the future, but this is what the TryFrom implementations indicate.

A lot of people assume that usize will be at least 32 bits but this is not true for all platforms. I think that the docs should be clarified to make people more aware of this behaviour.

est31 commented 6 years ago

I don't think that platforms where there are only 256 addressable bytes in the memory are really a domain where you can do any meaningful programming from today's point of view, in Rust or in any other language. So we could assume 16 bits at minimum I'd say.

steveklabnik commented 6 years ago

> It seems unlikely that Rust will support 8-bit targets in the future

Doesn't the new AVR stuff have 8 bit targets, as well as 16 bit ones? I'm not 100% sure.

est31 commented 6 years ago

On 8 bit targets the pointer size is not 8 bits, but 16 bits. As I've said above, nobody wants only 256 bytes of addressable memory :). The only platforms with 8 bit pointers are ones with memory models different from the C one, aka ones with distinct "memory types", each with its own pointer size. E.g. this one. I'm no embedded expert but to my knowledge these platforms are totally irrelevant today.

The best argument to require a minimum size of 16 bits for usize/isize is that C does the same (SIZE_MAX must be at least 65535).
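For illustration, a crate that wants to rely on the same 16-bit floor that C gives `SIZE_MAX` could state it as a compile-time check. This is a minimal sketch of an assumption a crate might choose to make, not a guarantee the language documents; the helper name `usize_meets_c_minimum` is hypothetical.

```rust
// Fails to compile on any target whose `usize` is narrower than 16 bits.
// This encodes a crate-level assumption, not a documented language guarantee.
const _: () = assert!(usize::BITS >= 16);

/// Hypothetical runtime mirror of the same check: true when `usize::MAX`
/// is at least 65535, the minimum C requires for `SIZE_MAX`.
fn usize_meets_c_minimum() -> bool {
    usize::MAX as u128 >= 65_535
}
```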

clarfonthey commented 6 years ago

I agree, and I think that From<u8> and From<u16> should be implemented for usize to help assert this. Also, sixteen bit targets should be mentioned regardless.
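As a rough sketch of what these impls mean for calling code: in current standard library releases, `From<u8>` and `From<u16>` for `usize` are both available, while wider sources go through `TryFrom`. The helper names below are hypothetical and exist only for illustration.

```rust
use std::convert::TryFrom;

/// Infallible widening: relies on `From<u16> for usize`, which only makes
/// sense if `usize` is at least 16 bits on every supported target.
fn index_from_u16(x: u16) -> usize {
    usize::from(x)
}

/// Fallible widening: a `u32` may not fit on a 16-bit target, so portable
/// code uses `TryFrom` and handles the failure case.
fn index_from_u32(x: u32) -> Option<usize> {
    usize::try_from(x).ok()
}
```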

durka commented 6 years ago

This was discussed at rust-lang/rfcs#1748.

steveklabnik commented 6 years ago

> On 8 bit targets the pointer size is not 8 bits, but 16 bits. As I've said above, nobody wants only 256 bytes of addressable memory :).

That's what I get for checking my email late, heh. Duh, thanks.

Yeah, this seems like a major policy decision. @rust-lang/lang, does this kind of thing require an RFC?

whitequark commented 6 years ago

> E.g. this one. I'm no embedded expert but to my knowledge these platforms are totally irrelevant today.

This is incorrect. For example, I've implemented a board support package for CY7C68013A, an incredibly popular USB 2.0 device controller, just a month ago, and that uses an extended 8051 (Dallas' clone, to be specific). In general, there's still a lot of 8051 cores out there and they're not going anywhere.

That said, I don't think Rust can or should support these platforms, simply because you need a disproportionate amount of language extensions (e.g. for 8051 it's first-class bit variables) to make them actually usable, and this cost should not be paid by rustc.

joshtriplett commented 6 years ago

While I don't think we should assume a maximum size for usize/isize, I think it's reasonable to assume that they're at least 16 bits on all supported platforms.

In practice, they're likely to be at least 32 bits on the vast majority of platforms, but personally I've occasionally wanted to run Rust on a 16-bit platform if LLVM could manage it.

clarfonthey commented 6 years ago

I think that assuming a maximum of 64 bits is very reasonable; that's 16 EiB. I can't imagine any machine wanting more than that in addressable memory.
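The 16 EiB figure can be checked with a little arithmetic: 64 address bits give 2^64 bytes, and one EiB is 2^60 bytes. A small sketch, with the hypothetical names `addressable_bytes` and `EIB` introduced only for this example:

```rust
/// One exbibyte in bytes (2^60).
const EIB: u128 = 1 << 60;

/// Number of distinct byte addresses representable with `bits` address bits.
/// Only valid for `bits < 128`, since the result is held in a `u128`.
fn addressable_bytes(bits: u32) -> u128 {
    1u128 << bits
}
```

With these definitions, `addressable_bytes(64) / EIB` is 16, and `addressable_bytes(16)` is 65536.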

petrochenkov commented 6 years ago

I'd rather not make any guarantees not statically backed up by the portability lint when it gets implemented. Then you can assume u32 <= usize <= u64 in the default configuration and whatever you specify by yourself otherwise.

hanna-kruppe commented 6 years ago

@clarcharr This sort of reasoning has historically not held up well 😜 I don't think anyone claims to know when 128 bit address spaces will crop up and what such a machine will look like, but it seems quite likely that someone will find a use for such a large address space sooner or later -- either by actually building a machine with that much memory (probably not all DRAM, but you can map other storage into virtual memory as well), or by using the extra bits for some other purpose. As a point of reference, RISC-V has reserved encoding space for (but not yet actually defined) a 128 bit ISA.

Besides, as @petrochenkov mentions, all kinds of "hypothetically we want to support this kind of target but most people don't actually care" cases, including this one, can be handled by the portability lint stuff.

nikomatsakis commented 6 years ago

> Then you can assume u32 <= usize <= u64 in the default configuration and whatever you specify by yourself otherwise.

This is what I assumed we would do.

clarfonthey commented 6 years ago

So I'm a tad confused; does that mean that From<u32> for usize is going to be implemented, or?

hanna-kruppe commented 6 years ago

IIUC all From<uN> for usize will be implemented but conditional on the appropriate cfg(target_pointer_width=...). The portability lint will use that to warn, for example, about using From<u64> in general purpose code, or From<u32> if your crate opts into being compatible with 16 bit targets.
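To make the cfg-gating idea concrete, here is a minimal sketch in user code. It is not how the standard library is implemented, and `u32_to_usize` is a hypothetical helper; it only shows how `cfg(target_pointer_width = ...)` can select a lossless path where one exists and a checked path elsewhere.

```rust
/// Hypothetical helper: on 32- and 64-bit targets every `u32` fits in a
/// `usize`, so the cast is lossless and the result is always `Some`.
#[cfg(any(target_pointer_width = "32", target_pointer_width = "64"))]
fn u32_to_usize(x: u32) -> Option<usize> {
    Some(x as usize)
}

/// On 16-bit targets the value may not fit, so check the range first.
#[cfg(target_pointer_width = "16")]
fn u32_to_usize(x: u32) -> Option<usize> {
    if x <= usize::MAX as u32 {
        Some(x as usize)
    } else {
        None
    }
}
```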

SimonSapin commented 6 years ago

PR https://github.com/rust-lang/rust/pull/49305 includes:

ltratt commented 6 years ago

FWIW, I doubt that usize <= u64 is a safe assumption for the future. There are interesting prototype/research platforms where it isn't true now (e.g. Cheri http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-907.pdf where pointers are typically 256 bits, though there is a 128 bit mode) and there's enough interest in these to suggest that we'll start seeing them in the real world sooner rather than later.

briansmith commented 6 years ago

> FWIW, I doubt that usize <= u64 is a safe assumption for the future. There are interesting prototype/research platforms where it isn't true now (e.g. Cheri http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-907.pdf where pointers are typically 256 bits, though there is a 128 bit mode) and there's enough interest in these to suggest that we'll start seeing them in the real world sooner rather than later.

The problem there isn't the assumption that usize <= u64, right? The problem is assuming that usize can be losslessly converted to and from any reference/pointer. Put another way, the problem is with overloading usize to mean both size_t and uintptr_t.
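The assumption in question can be written down as a one-line check. On mainstream targets today a thin pointer and `usize` have the same size, so the sketch below returns true there; a capability-style platform like the one described above is the kind of target where it might not. The function name is hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical check: does `usize` have the same size as a thin raw pointer?
/// Code that round-trips pointers through `usize` implicitly relies on this.
fn usize_matches_pointer_size() -> bool {
    size_of::<usize>() == size_of::<*const u8>()
}
```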

Serentty commented 5 years ago

I would definitely be against requiring pointers to be at least 32 bits, as there are all sorts of platforms that I'd love to develop for in Rust which have pointers smaller than that. However, I can't think of a single ISA with pointers smaller than 16 bits, as all of the popular 8-bit chips use two bytes for addresses. For that reason, I would be okay with requiring pointers to be at least that size. However, I would really rather leave the decision of how portable to be up to the programmer, rather than the language. If someone's code assumes that pointers are at least a certain size, that's their own decision.