carbon-language / carbon-lang

Carbon Language's main repository: documents, design, implementation, and related tools. (NOTE: Carbon Language is experimental; see README)
http://docs.carbon-lang.dev/

What should we call integer types with mod-2^N arithmetic? #2152

Open geoffromer opened 2 years ago

geoffromer commented 2 years ago

p1083 and the current design both say

Unsigned integer types wrap around on overflow, we strongly advise that they are not used except when those semantics are desired.

That being the case, it seems unhelpful or even misleading to refer to these types as "unsigned integer types", and to spell their names Carbon.UInt(N) or uN. This terminology will encourage users to think of them like C++ unsigned types, and to focus on the lack of a sign bit rather than on the wrapping semantics. This will inevitably steer them toward exactly the kinds of uses that we're trying to discourage, namely representing quantities (like sizes) that are guaranteed to be non-negative but do not naturally have wrapping semantics. Conversely, terminology that focuses on the wrapping semantics would reinforce the principle quoted above; even if we can't achieve that, almost any terminology other than "unsigned" would at least avoid actively undermining that principle.
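To make the hazard concrete, here is a minimal sketch in Rust (chosen only because it has explicit wrapping operations; it is not Carbon, and the function names are hypothetical). It computes a size-like quantity once with mod-2^32 arithmetic and once with ordinary signed arithmetic.

```rust
/// Hypothetical helper: free slots in a buffer, computed with mod-2^32
/// arithmetic. `wrapping_sub` spells out the wraparound that the quoted
/// design text describes for unsigned types.
pub fn remaining_wrapping(capacity: u32, used: u32) -> u32 {
    capacity.wrapping_sub(used)
}

/// The same computation on a signed type. An over-full buffer shows up
/// as a negative number instead of a very large positive one.
pub fn remaining_signed(capacity: i64, used: i64) -> i64 {
    capacity - used
}
```

With a capacity of 3 and 5 slots used, `remaining_wrapping` returns 4294967294 while `remaining_signed` returns -2. The first result looks like a plausible size, which is why a non-negative quantity without natural wrapping semantics is a poor fit for such a type.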

Some alternatives to consider:

Terms of art:

Type names:

Type shorthands:

To elaborate on the last option: the use cases for mod-2^N integers are far less common than the use cases for ordinary integers, so the need for a shorthand seems less compelling, and omitting a shorthand would help encourage people to use the signed integer types by default.

None of these options seems fully satisfactory, but I think many of them would be a substantial improvement on the status quo.

tkoeppe commented 2 years ago

On a tangent: "modular" (in maths) doesn't imply unsigned, since the choice of representative of the residue class is arbitrary. You could just as well have modular/"wrapping" signed integers.
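As an illustration of the point about representatives, the following Rust sketch (not Carbon; the function names are hypothetical) adds the same two 8-bit patterns mod 2^8, once reading them as values in 0..=255 and once as values in -128..=127. It assumes two's-complement reinterpretation, which is what Rust's `as` casts between `u8` and `i8` perform.

```rust
/// Adds two 8-bit patterns mod 2^8, reading them as the unsigned
/// representatives 0..=255.
pub fn add_unsigned_view(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Adds the same two patterns mod 2^8, but reads them as the signed
/// representatives -128..=127 before adding, then converts the bits back.
pub fn add_signed_view(a: u8, b: u8) -> u8 {
    let sum: i8 = (a as i8).wrapping_add(b as i8);
    sum as u8
}
```

Both functions return the same bit pattern for every pair of inputs. For example, 200 and 100 give 44 either way, because 200 reads as -56 in the signed view and -56 + 100 = 44.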

On another tangent, since the mod-2^N equivalence is not order-preserving, ordering doesn't make sense for "modular" values, so should/can we make relational expressions ill-formed for those types?
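A small Rust sketch of the ordering issue (not Carbon; `order_survives_shift` is a hypothetical name): it checks whether comparing two values gives the same answer before and after adding the same offset to both mod 2^8.

```rust
/// Returns true if `a < b` gives the same answer before and after adding
/// the same offset `k` to both sides mod 2^8.
pub fn order_survives_shift(a: u8, b: u8, k: u8) -> bool {
    (a < b) == (a.wrapping_add(k) < b.wrapping_add(k))
}
```

For a = 10, b = 250, k = 10 this returns false: 10 < 250 holds, but after the shift the values are 20 and 4, so the comparison flips. For a = 10, b = 20, k = 10 it returns true, since nothing wraps.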

geoffromer commented 2 years ago

On a tangent: "modular" (in maths) doesn't imply unsigned, since the choice of representative of the residue class is arbitrary. You could just as well have modular/"wrapping" signed integers.

Sure, but in practice the chosen representative is virtually always the smallest non-negative member of the class. I don't think this will cause any confusion.

On another tangent, since the mod-2^N equivalence is not order-preserving, ordering doesn't make sense for "modular" values, so should/can we make relational expressions ill-formed for those types?

I don't think we should do that, because I suspect it would break important use cases. So I think "wrapping" is a more strictly accurate description of these types than "modular", but on the other hand "wrapping" is a little more ambiguous about whether there's a meaningful sign bit, and also somewhat more prone to colliding with other meanings of the word.

tkoeppe commented 2 years ago

I don't think we should do that, because I suspect it would break important use cases. So I think "wrapping" is a more strictly accurate description of these types than "modular", but on the other hand "wrapping" is a little more ambiguous about whether there's a meaningful sign bit, and also somewhat more prone to colliding with other meanings of the word.

In that case I'm sympathetic to option "none"; just don't provide wrapping semantics via operators at all and instead provide normal library functions (on unsigned types) that implement modular arithmetic (as well as other kinds of arithmetic, e.g. saturating).
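For comparison, Rust's standard library already takes roughly this shape for its integer types: the overflow policy is picked per call through named methods. The sketch below only shows what such an interface looks like; it is not a proposal for Carbon's API, and the wrapper names are hypothetical.

```rust
/// Modular (mod-2^8) addition: the result wraps around on overflow.
pub fn add_modular(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Saturating addition: the result clamps at `u8::MAX` on overflow.
pub fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

/// Checked addition: overflow is reported as `None` instead of a value.
pub fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}
```

Adding 250 and 10 yields 4 with `add_modular`, 255 with `add_saturating`, and `None` with `add_checked`, so the choice of semantics is visible at each call site rather than implied by the operand type.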
