Open ozra opened 8 years ago
non negative Int that is not binary unsigned
Firstly, what? And secondly, why is this needed?
Suggest IntBase or AnyInt instead of SomeInt. Some implies an Option = Some | None kind of concept.
Not sure about Arr. Why not just use the alternative syntax I propose in #31? After all, an array cannot exist without a concrete type.
What's a BufPtr?
1) If you want a variable that can only hold natural numbers (including 0), then you need a Natural data type. Unsigned is a data type used for technical legacy reasons (primarily FFI or direct hardware communication). It's far better to double the number of bits to extend the range than to change the meaning of one bit to double it. Ints and Nats can be compared without any confusion. At the machine-code level, the same optimizations used for unsigned apply (muls becoming shifts, etc.), since the value is promised to be non-negative. Since it also stays within the range of the corresponding Int width, it can be optimized when used in combination with Int as well. Win-win for performance. "Natural is more natural than unsigned". (In all honesty: on the binary level they are "unsigned"; they simply limit their range so they never touch the sign bit.)
2) Agree - both of those proposals sound better.
3) Arr should probably just be Array; Arr sounds a bit weird, and static arrays aren't that common. As for alternative syntax, I've had a couple (for value creation, though!) on a todo list; I'll jump into the discussion in #31. Alternatively, the word Array shouldn't be used at all, to minimize confusion!
4) Loose idea of a Ptr associated with a Buffer. It would be helpful for range checks during development, as extra testing insurance, and would compile to zero overhead in release mode. As an aside, it could also be helpful for certain GCs.
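To make the BufPtr idea in point 4 a bit more concrete, here is a rough Rust sketch (Rust only because it has debug-only assertions built in; this is not Onyx code). The `BufPtr` struct, its fields, and its methods are all hypothetical names for illustration, and the design is an assumption about what "a Ptr associated with a Buffer" could look like, not a spec.

```rust
/// Hypothetical sketch: a cursor that remembers the buffer it points into.
/// Not the Onyx type; all names here are made up for illustration.
pub struct BufPtr<'a, T> {
    buf: &'a [T],
    pos: usize,
}

impl<'a, T: Copy> BufPtr<'a, T> {
    pub fn new(buf: &'a [T]) -> Self {
        BufPtr { buf, pos: 0 }
    }

    /// Move the cursor by `delta` elements (one-past-the-end is allowed).
    /// `debug_assert!` is compiled out when debug assertions are disabled,
    /// which is the default for release builds.
    pub fn offset(&mut self, delta: isize) {
        let next = self.pos as isize + delta;
        debug_assert!(
            next >= 0 && (next as usize) <= self.buf.len(),
            "BufPtr moved outside its buffer"
        );
        self.pos = next as usize;
    }

    /// Read the element under the cursor.
    pub fn read(&self) -> T {
        debug_assert!(self.pos < self.buf.len(), "BufPtr read out of range");
        // Unchecked access so that release builds pay no bounds-check cost.
        // This is only sound if the debug-time checks caught every misuse.
        unsafe { *self.buf.get_unchecked(self.pos) }
    }
}
```

With `let data = [10, 20, 30];` and `let mut p = BufPtr::new(&data[..]);`, calling `p.offset(2)` and then `p.read()` yields the last element, while `p.offset(5)` would panic in a debug build and go unchecked in release.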
So a Nat takes more memory but is higher performance. But how many bits does it have?
The idea is simply that it follows the same pattern as Int: defaults to architecture pointer width (unless unreasonable for the specific arch), but can be chosen explicitly. And naturally (ha) also Nat32/Nat64 or N32/N64.
Why would I use a Nat when I can use an Int?
Where a natural number is needed, it's better to be able to express that as a type in the program and get the help that types are for, instead of implementing range checks yourself everywhere (and ifdefs to remove them for no-belts release speed). Indices, sizes/dimensions, etc.
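As a rough illustration of what typing naturals buys you, here is a sketch in Rust (not Onyx). `Nat32`, `to_int`, and `nth` are hypothetical names, and the representation (a signed 32-bit value restricted to 0..=i32::MAX) is my reading of the idea, not a finished design.

```rust
use std::cmp::Ordering;

/// Hypothetical `Nat32`: a natural number kept in a signed 32-bit slot,
/// restricted to 0..=i32::MAX so the sign bit is never touched.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Nat32(i32);

impl Nat32 {
    /// Construction is the one place a range check is needed.
    pub fn new(value: i32) -> Option<Nat32> {
        if value >= 0 { Some(Nat32(value)) } else { None }
    }

    /// Going back to the same-width Int is always lossless.
    pub fn to_int(self) -> i32 {
        self.0
    }

    /// Subtraction can leave the natural range, so the result is optional.
    /// Both operands are in 0..=i32::MAX, so the i32 subtraction cannot overflow.
    pub fn checked_sub(self, rhs: Nat32) -> Option<Nat32> {
        Nat32::new(self.0 - rhs.0)
    }
}

/// Nat and Int share a representation, so mixed comparison is a plain
/// signed compare with no unsigned-style surprises.
impl PartialEq<i32> for Nat32 {
    fn eq(&self, other: &i32) -> bool {
        self.0 == *other
    }
}

impl PartialOrd<i32> for Nat32 {
    fn partial_cmp(&self, other: &i32) -> Option<Ordering> {
        self.0.partial_cmp(other)
    }
}

/// An API that wants an index can say so in its signature
/// instead of re-checking `index >= 0` at every call site.
pub fn nth(items: &[u8], index: Nat32) -> Option<u8> {
    items.get(index.to_int() as usize).copied()
}
```

Here `Nat32::new(-1)` is rejected at construction, and `Nat32::new(3).unwrap() > -1_i32` holds, which is the comparison that goes wrong when a binary unsigned type is mixed with a signed one in C-like languages.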
As a side note, Nat? is also (thanks to the unused bit, and with a lot of code-gen hacking) possible to store in the space of Nat, which would give a typed ability to differentiate between, say, index and not set with the same space and performance. So, there are quite a few perks.
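A hand-rolled Rust sketch of the packing trick, to show where the space saving comes from. `OptNat32`, `NOT_SET`, and `find` are hypothetical names, and using -1 as the "not set" pattern is an assumption; any negative pattern would do, and in Onyx the compiler would do this behind the scenes rather than the user.

```rust
/// Hypothetical packed `Nat?`: an optional natural number in a single i32.
/// A Nat never uses negative bit patterns, so -1 is free to mean "not set".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OptNat32(i32);

impl OptNat32 {
    pub const NOT_SET: OptNat32 = OptNat32(-1);

    /// Wrap a value; negative input collapses to "not set".
    pub fn some(value: i32) -> OptNat32 {
        if value >= 0 { OptNat32(value) } else { OptNat32::NOT_SET }
    }

    /// Unpack into an ordinary Option for pattern matching.
    pub fn get(self) -> Option<i32> {
        if self.0 >= 0 { Some(self.0) } else { None }
    }
}

/// Example use: a search that returns either an index or "not set".
/// Assumes the slice is short enough for its indices to fit in an i32.
pub fn find(items: &[i32], wanted: i32) -> OptNat32 {
    match items.iter().position(|&x| x == wanted) {
        Some(i) => OptNat32::some(i as i32),
        None => OptNat32::NOT_SET,
    }
}
```

`std::mem::size_of::<OptNat32>()` equals the size of a plain `i32`, whereas `Option<i32>` is larger because a full-range i32 has no spare bit pattern to hold the "none" case.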
As for the special treatment of allowing changing standard Int/Real with a "first thing" alias, I realise: it's a pragmatic solution, so of course that should be a pragma instead, and it should only be possible to choose the bit width, not a specific type (you can't, for instance, have BigInt as primary int type for practical reasons). Otherwise behave the same.
ed: It can only be set once for the entire program of course.
'std-real-width = 64
'std-int-width = 32
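The closest everyday analogue I can think of is a single program-wide type alias. The Rust sketch below is only that, an analogue: the `Int` and `Real` aliases and the `mean` function are hypothetical, and the pragma itself would set bit widths rather than define aliases.

```rust
// Hypothetical analogue of the two pragmas above: one definition per
// program, in exactly one place.
pub type Int = i32; // assumption: this program opted into 32-bit ints
pub type Real = f64; // assumption: 64-bit reals

/// Library-style code written against the aliases, never a fixed width.
/// Because it is compiled together with the program, it follows whatever
/// widths the program chose.
pub fn mean(values: &[Int]) -> Real {
    if values.is_empty() {
        return 0.0;
    }
    let sum: Int = values.iter().sum();
    sum as Real / values.len() as Real
}
```

Changing the two alias lines is the only edit needed to switch widths; `mean` and every other function written against `Int` and `Real` follow along.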
I don't really see the point. If I want a specific size of int, I'll just specify Int64 or something. Changing the size of a bare int, program-wide, is asking for trouble.
What's your reasoning?
Well, under what circumstances would anyone wish to use this feature?
And if they did, surely it would cause more harm than good? If I move a function from one file to another, suddenly it could be using a different type of int.
int should be arch-dependent. And if I specifically require a 64-bit int, I will specify it at the point of use.
I see your concern: no; you're only allowed a choice for the whole program regarding width of the int/real - it cannot change for different files.
For most of the cases the default is fine.
But if for some reason you're, say, making some specialist application where a lot of integer division occurs, then a substantial speedup could be gained from switching from 64 bit to 32 bit.
Likewise, if someone is writing an app that works with rather big integer numbers all through, it could as well use Int for cleaner appearance and simply declare that the int must be 64 bit for this program.
I edited the proposition, one does not want to change the arch-types that are set automatically...
Wouldn't it be far more readable to specify Int64 throughout if that's what the programmer intended? It just seems sloppy to redefine things like that.
What about built-in library functions? If I redefine Int to be Int32, and I call a library function that returns Int, what do I get? Aren't library functions baked into the compiler when you build it?
If someone makes the change for performance reasons, for instance, it's not an intended requirement of the code, simply a measure taken to improve run time. If the program were deployed to another architecture that handles those instructions blazingly fast, it would be better to use a different int width there.
This is not supposed to be "used everyday", just as little as "returns-twice". It's a power user feature. I just think it's good to not lock the user out of the option of choosing. I'm keeping this open for debate still though.
Library functions are included in your program as source-"modules" just as the rest of the program and participate through the entire compile process, inference and all, so they will happily comply.
I've updated the OP: SomeInt is changed to the suggested AnyInt, which is better. Thereby also Any (mirroring Object in Crystal).
Edited OP, look there for current namings.
Standard Types - Namings
NOTE - this is only partially implemented currently, and so, it's mostly a discussion of how/what to actually implement.
Since Onyx uses Crystal stdlib, there are already de facto names for common types. However, I feel there's a need for some cleanup from the perspective of Onyx.
I'd like to favour terse names for the common types.
Proposed Type Names in Onyx
Nil
Any
Num
Int
Intd: ArchInt, unless specified otherwise for a specific program
Real: ArchReal, unless specified otherwise for a specific program
Nat
List
Map
Tup
TTup
Set
Tag
Str
Bool
Ptr
ArchInt
ArchReal
Have I forgotten some obvious one?
"Machine Level" Data Types
Keeping these slick could be good, and it also signals their "machine-closeness" (do use "cleaner" types like Intd, Real etc. for most things! These are for type-defs/performance/c-lib interfacing code).
F32
F64
I8
I16
I32
I64
U8
U16
U32
U64
Suggested Definition of Arch* types
As you can see, heavily x86*-centric atm, has to be extended when other architectures are added.
Note that this pseudo-code is to showcase the definition, in reality it will be specified only as "bit-width for Int and Real, respectively", and not as aliases.
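For a feel of what an arch-dependent default could look like in a language with conditional compilation, here is a rough Rust sketch. It is not the Onyx definition: the alias form, the choice of pointer width as the selector, and the fixed 64-bit `ArchReal` are all assumptions made for illustration.

```rust
// Hypothetical mapping from target architecture to the default Int width.
// Assumption: the default follows the target's pointer width.
#[cfg(target_pointer_width = "64")]
pub type ArchInt = i64;
#[cfg(target_pointer_width = "32")]
pub type ArchInt = i32;
#[cfg(target_pointer_width = "16")]
pub type ArchInt = i16;

// Assumption: reals default to 64 bits on every target in this sketch.
pub type ArchReal = f64;

/// Report the width that was selected for this build target.
pub fn arch_int_bits() -> u32 {
    ArchInt::BITS
}
```

On any given target, `arch_int_bits()` matches `usize::BITS`, which is one way to state "defaults to architecture pointer width" as a checkable property.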
Thoughts?