This will cause portability problems, and I really don't think that's a good idea.
I don't think this is actually going to happen either. The size of pointers is unlikely to change to 128 bits in the near future, simply because having 16 exbibytes of memory is almost physically impossible. By the time physics gets to the level where we have to worry about 128-bit address buses, we'll all be dead.
Since the integer size of processors is closely related to the pointer size, I don't foresee support for more than native 256-bit arithmetic in the future. I think 64 bits is a good bet for the default integer size until basically forever.
After some research, the performance impact may be negligible; it might even improve a bit. This needs confirmation for the Crystal case.
It's a shame that on 64-bit machines we don't use all the available register bits by default.
@RX14 Well, yeah, that's what I propose, or at least changing the default int size to Int64... not sure how that will behave on 32-bit systems (yes, Crystal still supports those).
I forgot to mention that what I propose is how it works in Go and Nim. I didn't check other languages, though...
For reference, Rust has `isize` and `usize`, and Go has `int` and `uint`.
Using 64-bit integers on 32-bit systems will undoubtedly kill performance; really bad idea.
`armhf` is still there, and in general 32-bit is and will remain widely used in efficient/embedded systems. One of Crystal's aims is to be efficient; some users target these machines because it fits perfectly on Raspberry Pis, Arduinos, routers...
@asterite's proposal would make better use of the underlying system's registers, eventually bringing more performance.
Hm, Rust's default integer type is `i32`, and they say it's the fastest type, even on 64-bit systems. So that's something to think about...
@asterite The final remark you focused on refers to situations where a programmer isn't sure which type to pick. Though the general rule is:

> ... the `isize` and `usize` types depend on the kind of computer your program is running on: 64 bits if you’re on a 64-bit architecture and 32 bits if you’re on a 32-bit architecture.
@pkorotkov Right, but if we let `123` be of type `Int64` on 64 bits, it might be a waste of size/performance.
That may only be the case on x86-64, which is backward compatible with x86 and has optimizations to run it.
Even if in the end we stick with `Int32` by default, I hope we can end up having `Int`/`UInt` or something similar that uses the system's native integer bit size (32 or 64). Other languages provide these types; they're useful for lower-level/platform optimization stuff.
@j8r We do have such a type: it's `LibC::SizeT`. Having a type for it is not a problem. The problem is that the current containers (and even files) have an `Int32` limit, where the limit should be `Int64` (or `UInt64`) on 64-bit systems. And how do we design the language so that you don't have to constantly cast stuff, and so that a program that compiles fine on 32 bits also compiles fine on 64 bits, even though a type changes definition from one system to another?
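A minimal sketch of that casting friction, assuming today's stdlib where container lengths are `Int32` while file sizes are 64-bit (`"data.bin"` is just a placeholder):

```crystal
# Container lengths are Int32, but File.size returns a 64-bit value,
# so sizing a buffer from a file needs an explicit narrowing cast.
file_size = File.size("data.bin")     # 64-bit (UInt64/Int64 depending on the Crystal version)
buffer = Bytes.new(file_size.to_i32)  # .to_i32 needed; breaks for files over Int32::MAX bytes
```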
@asterite I appreciate your concerns about possible performance hits, but real-world apps do indeed require machine-dependent container lengths.
Having a simple `Int` type (and `UInt`) instead of specifying `Int32` would be very nice! Same for the `Integer` alias; it's a very good idea.
Having an arch-dependent integer size would help fix some inconsistencies (e.g. pointer addresses), probably help with some C bindings, and yes, allow container types (String, Array, ...) to map more memory.
I'm not sure about making `Int` arch-dependent. You'd have to keep it in mind, and that could lead to portability issues between 32- and 64-bit architectures. But Go does that (edit: Swift too), so maybe it's not much of a concern? We can still use explicit `Int32` or `Int64` when needed.
Defaulting to 64-bit on all architectures would ruin program performance on 32-bit architectures, so I'm uncomfortable with that. It also feels like wasted space to me; I seldom need integers to be in the 64-bit range, but that's my personal impression. Oh well, I can just keep using `Int32`.
`Int` and `UInt` for arch-dependent integer sizes (see https://docs.swift.org/swift-book/LanguageGuide/TheBasics.html#ID317). `Int32` when needed.

The proposed RFC is good :+1:
+1
Maybe the standard library can use arch-dependent integer sizes correctly. I don't trust anyone else to run CI on 32-bit systems, simply because it's very hard to set up. For that reason, we shouldn't encourage or expose arch-specific integer types (outside `LibC`), because without error-prone automatic casting like C's, we will end up with all libraries de facto only working on 64 bits. And that's not a situation I want to be in. Extremely strong thumbs down on this whole feature.
`Pointer` itself can have an arch-specific size, but manually extracting an integer from it with `Pointer#address` should always give 64 bits. Pointer arithmetic is usually done on pointer instances anyway, so it won't affect performance.
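For illustration, based on current behavior where `Pointer#address` returns a `UInt64`:

```crystal
# The address is a UInt64 even on 32-bit targets; arithmetic is
# normally done on the pointer instance itself, not on the address.
ptr = Pointer(Int32).malloc(4)
typeof(ptr.address)               # => UInt64
(ptr + 1).address - ptr.address   # => 4 (one Int32 element, in bytes)
```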
Most containers should either remain 32-bit or be switched to 64-bit. I feel very strongly that Crystal should attempt to unify integer sizes and not expose a confusing array of aliases. Crystal is not a language for building system software; it is a high-level, garbage-collected language built for non-embedded computers. Even most phones are aarch64 these days. Removing support for 32-bit entirely is feasible in the next 5 years, some Linux distros have done it already, and 128-bit architectures will never happen. Let's be practical and make things easier and less error-prone for the developer, instead of making Crystal only "portable with effort", like C is. Most portable C programs have insane macros and look ugly. Let's not take Crystal down the same path. The current status quo of number types in Crystal is fine.
Swift uses arch-dependent default integer sizes because Swift is specifically designed for Objective-C compatibility; see also refcounting, etc. I doubt Swift would be designed as-is if not for those constraints.
Also note that in Go, `int` and `int32`/`int64` are separate types (not just an alias, which is what's being proposed here) which always require an explicit cast to convert between. This avoids the compilation/portability problem, but doesn't avoid the issue of overflow. We could implement a separate type `Int` and make it the default, but the diffs would be ginormous and it'd never get merged.
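A rough sketch of what such a dedicated (non-alias) type could look like; `NativeInt` is a hypothetical name, not an actual proposal or stdlib API:

```crystal
# Hypothetical: a distinct platform-sized integer that never mixes
# implicitly with Int32/Int64 and always needs an explicit conversion.
struct NativeInt
  # bits64 is the compile-time flag set on 64-bit targets.
  {% if flag?(:bits64) %}
    @value : Int64
  {% else %}
    @value : Int32
  {% end %}

  def initialize(@value)
  end

  def to_i32 : Int32
    @value.to_i32 # narrowing; may raise on overflow
  end

  def to_i64 : Int64
    @value.to_i64
  end
end
```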
@RX14 that's a good point
I think the only container that effectively needs a 64-bit size is `Slice`, but there it is really needed. Being able to load and access more than 4 gigabytes of data in memory is not something for the future; it's already common in some applications.
Crystal should be able to use 64-bit sizes for `Slice`, one way or the other. It probably doesn't need a platform-dependent integer type. Maybe it's fine to just use `Int64` even on 32-bit platforms. Can anyone tell how much of a performance impact that would have?
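To put a number on it (illustrative arithmetic only):

```crystal
# A 6 GiB mapping has more bytes than an Int32 length can express.
six_gib = 6_i64 * 1024 * 1024 * 1024   # => 6_442_450_944
six_gib > Int32::MAX                   # => true (Int32::MAX is 2_147_483_647)
```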
I strongly believe that by the time anyone is using Crystal on embedded Linux, they will be using it on aarch64 or a similar 64-bit architecture, simply due to economies of scale in the smartphone SoC market. I just don't think 32-bit is going to exist on any system Crystal is targeting going forward, so I'm really in favour of using 64 bits for pointers and slices. Arrays, strings and other containers can be discussed later, after we've agreed on a strategy for `Slice`.
For the record: `Pointer` already uses `UInt64` for addresses, even on 32-bit platforms.
That 32-bit is dead is a bold assumption. I doubt Raspberry Pi will drop its 32-bit models anytime soon, and they don't seem eager to have an arm64 Raspbian for the Pi 3.
Assuming that the max integer size is always the pointer size is wrong. On Arduino, pointers are 8-bit (or maybe it's 16?) but an `int` is 32-bit. On ARM, the FPU has 64-bit registers. Who can say that in a few years there won't be an arch with 64-bit pointers and 128-bit integers? Most languages already provide support for them.
I don't know if an `Int` type should be arch-dependent. What I know is that lower-level languages such as Rust chose to keep it a 32-bit integer and to introduce an arch-dependent type (`isize` in Rust, `long` in C), whereas similar languages such as Go and Swift decided to make it arch-dependent.
Swift didn't have to (in C, `int` is 32-bit and `long` is arch-dependent). Swift could have fixed `Int` to be `Int64` and kept a `Long` type around for Obj-C compatibility, yet they didn't and instead chose for `Int` to be arch-dependent. That's an interesting choice, and IMHO a realistic one.
@ysbaddaden Even if 128-bit arithmetic becomes common (we already have an `Int128` type), I doubt we'd want to make it the default type.
I'm also not thinking about dropping 32-bit support, but I'm being realistic: 99%+ of Crystal usage will be on desktops and servers. So taking a performance hit on 32-bit is acceptable, especially if it's only around slices.
However, for the record, I think the status quo on int types is just fine too. The only thing that really needs to change is the types inside certain specific containers, to 64-bit. Any proposal to make architecture-dependent types common or recommended outside of the C interface is a huge no from me, though.
32-bit users, on embedded or old devices, shouldn't be relegated to second-class citizens.
For now, for things limited by the system's register/pointer size, like containers, a new system-dependent type like `IntT` and `UIntT` (or whatever name) could be used, based on `LibC::SizeT`. This would change nothing for 32-bit, and give more room on 64-bit.
Like @ysbaddaden said, it isn't all black and white. With RISC-V coming, and also OpenRISC, this opens even more possibilities. The language adapting itself a bit depending on the platform would allow it to take better advantage of the system's characteristics.
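A hedged sketch of that idea; `IntT`/`UIntT` are names from this thread rather than existing stdlib types, and it assumes the libc binding defines `SSizeT` for the target:

```crystal
# Platform-sized aliases derived from the libc bindings (sketch only).
alias UIntT = LibC::SizeT    # UInt32 on 32-bit targets, UInt64 on 64-bit ones
alias IntT  = LibC::SSizeT   # signed counterpart

puts sizeof(IntT) * 8        # prints 32 or 64 depending on the platform
```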
@j8r It's not an issue as long as the type is only used internally. But with `Slice#size` returning `Int32` or `Int64` depending on the platform, you can end up with different types in your math operations and potentially risk overflows on 32-bit systems.
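A small illustration of that hazard (hypothetical values; assumes checked integer arithmetic):

```crystal
# The same expression is fine as Int64 but exceeds Int32's range.
size_64 = 3_000_000_i64
size_32 = 3_000_000_i32
size_64 * 1024     # => 3_072_000_000, fits in Int64
# size_32 * 1024   # exceeds Int32::MAX (2_147_483_647): overflow on 32-bit math
```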
@straight-shoota Good point. This could return an `IntT`, and then a cast can be made.
Or simply let users cast with `to_i` if they want, reducing the risk of overflow.
But if we agree to start using them progressively in some places internally, it would be a nice start 😀
An example may be #6640
The hypothetical `IntT` could be a dedicated type, not just an alias, with literals like `9_t` and `9_ut` and `#to_t` for casting.
Closing because I think this is a bad proposal. The way to go would be to have a separate int type that's not compatible with/assignable from other int types, like in Go. That would ensure a program that compiles on one system always compiles on another. But I don't think we can do that, or that we will.
Right now `Int32` is the default type for integer literals that don't have a type.

We also have #4011, where `Array` is limited to an `Int32` length... and in fact we also have this limit in `String`, `Slice`, `Hash` and basically every container. I think this can indeed be a problem if someone wants a program to use that much memory. Maybe right now it's not that common, but as time passes maybe we'll have 128-bit machines and having more and more data will be common?

In any case, we should have an integer type that depends on the platform and have containers that support such a size.
To do that, I propose all of the following:
- Rename the `Int` type to `Integer`
- Add an `Int` type that is an alias of `Int32` or `Int64`, depending on the platform
- Make integer literals have the `Int` type (though values larger than `Int32::MAX` will have an `Int64` type)
- Change `Slice`, `Array`, `Hash` and `String` lengths to be `Int`
- Use `Int` in most places instead of `Int32`
We can then also do the same with `UInt` (and have literals like `123u`).

This change doesn't have to be done now, but I believe eventually we'll have to do it.
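For illustration, a minimal sketch of what the literal rules mean; the `# =>` results assume current compiler behavior, and the last comment describes the proposal rather than existing behavior:

```crystal
typeof(123)           # => Int32 (today's default)
typeof(3_000_000_000) # => Int64 (the literal doesn't fit in Int32)
typeof(123_u32)       # => UInt32

# Under the proposal, typeof(123) would instead be the platform alias Int:
# Int32 on 32-bit targets, Int64 on 64-bit ones.
```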