Signed extension planned?

ckormanyos / wide-integer

Wide-Integer implements a generic C++ template for uint128_t, uint256_t, uint512_t, uint1024_t, etc.

Boost Software License 1.0

190 stars 31 forks source link

Signed extension planned? #10

Closed ckormanyos closed 3 years ago

ckormanyos commented 4 years ago

The unsigned integer types are present. Are there any potential plans for a signed integer type?

johnmcfarlane commented 3 years ago

TBH, I'm mostly lurking in this repo just hoping for signed to happen.

CNL's wide_integer and duplex_integer are a mess. They're buggy, they ruin compile times and they generate acres of code. All algorithms involved are either naive or, at times, even less efficient than that!

If I could lift from anywhere a lightweight constexpr solution, which basically picks up where __int128 leaves off, it would save me a huge amount of effort and get CNL much much closer to its long-term goals, i.e. the static_number type. It would also give this project a lot of constant road testing. :)

I don't think that unsigned alone is sufficient reason to integrate wide-integer with CNL, because I'd still need to do most of the work again for signed -- which is only as hard or harder to implement. (So I totally understand why you're holding off.)

ckormanyos commented 3 years ago

hoping for signed to happen

Hi John, thanks for lurking and for your query. Yes, I have held this off for quite some time now.

I feel confident that I would be up to the challenge of intwide_t. But @johnmcfarlane could we decide a few things? Then I can get started. We can, of course change our minds later, but I could get on the task if we decide a few things.

Top-level design choice:

Carry an extra Boolean sign as an actual member of the class?
Or integrate the sign information in a modulo 2 unsigned highest order limb?
Or use 1 single signed limb and n - 1 unsigned limbs

We briefly discussed this in the past, but we were (and also I was) unconclusive.

I guess I slightly prefer number 2.

What are your thoughts?

johnmcfarlane commented 3 years ago

Great! I opted for 3 in my implementation, which basically built a tree of limbs, e.g. ...

pair<
  pair<
    pair<signed, unsigned>, 
    pair<unsigned, unsigned>>,
  pair<
    pair<unsigned, unsigned>, 
    unsigned>>

...for a 224-bit-wide integer. (It's actually duplex_integer, not pair but they're equivalent.) It's simple but probably very naive I'm sure. Note the signed limb there.

In theory, it doesn't matter all that much so long as the behavior is as if it's a really wide native integer. If the structure can be tightly packed and still efficient, that's a benefit because users can treat it more like native types then. So my vote is for either 2 or 3. I will mostly wrap this in another type (wide_integer) so such details won't often be user-facing.

Other questions:

can/should it be able to use __int128_t limbs. (Is that even possible or do you use them for the long multiplication?)
are widths powers of two only? I.e. would the example above be allowed / desirable.

I'm more curious about these questions than anything.

Thanks John

ckormanyos commented 3 years ago

Lots of information here and lots of open questions. But I'm a fan of adapting and evolving as we go...

structure can be tightly packed and still efficient, that's a benefit because users can treat it more like native types

OK.

can/should it be able to use __int128_t limbs. (Is that even possible or do you use them for the long multiplication?)

I never really tried. My gut feeling is that for storage, 128-bit limbs could work and even be a benefit. But I fear it would be necessary to unpack them into smaller half-types for mul/div operations, as these atomic parts are needed for carries.

are widths powers of two only? I.e. would the example above be allowed / desirable.

These are questions I'd be interested to know if SG6/SG14/LEWG weigh in on these as well. For the moment in uintwide_t, I opted for bit resolution with a specific fine, but non-single-bit granularity having bit counts of $1{\ldots}63{\times}2^{N}$ , while being 16, 24, 32 or larger. I think this is a good compromise on bit granularity versus implementation complexity.

But, as mentioned above, we can evolve as we go.

johnmcfarlane commented 3 years ago

But I fear it would be necessary to unpack them into smaller half-types for mul/div operations, as these atomic parts are needed for carries.

That's what I suspected. I cannot use the widest type as a limb for the same reason.

These are questions I'd be interested to know if SG6/SG14/LEWG weigh in on these as well.

I suspect that Chapter 6 of P1889, which in turn points back to P0539R4 is the latest word on this. However,

I think this API is far from being final, and
CNL is all about dividing concerns into separate types and wrapping them in one another; it would be easy to adapt a simple, general-purpose wide integer in a new API if that were necessary.

So don't feel beholden to this design from a standards PoV. We're entitled to explore the design space and provide feedback to the committee.

For the moment in uintwide_t, I opted for bit resolution with a specific fine, but non-single-bit granularity having bit counts of $1{\ldots}63{\times}2^{N}$ , while being 16, 24, 32 or larger. I think this is a good compromise on bit granularity versus implementation complexity.

The purpose of a human interface is not to hide what the code does but to accurately convey what the code does. My interpretation of this would be that it's OK if the interface specifies NumLimbs, rather than Digits2 because that's the natural 'currency' of this type.

As a result, fixed_static_array is closer to what I think is needed. Obvious the API would need to change. I'd recommend simply accepting a signed MyType. Then the first limb might be MyType (depending on choices #2 vs #3) and the remaining limbs would std::make_unsigned_t<MyType>.

The other thing missing is constexpr. I'm looking at how easy it would be to make fixed_static_array a literal type, i.e. usable from a constant expression. This is an incredibly useful quality. It becomes much easier to achieve in later language revisions. Even C++14 might be enough to convert over with relatively little change. I'm happy to look into that. What are your thought there?

ckormanyos commented 3 years ago

OK in light of all these considerations, I suggest the following.

I have needed the signed wide integer class mentioned in this thread (in a fast, new, clean form) for a long, long time, so let's actually get specific on it.

@johnmcfarlane consider the points below and plese add your thoughts to the list(s) below?

Type is a signed wide integer type that starts where int64_t ends.
Prefer the interface in P0539R4.
Support any bit size above a sensible lower limit.
Strive for constexpr construction and binary algebraic and comparison operations.
Implementation language C++20.
No need for any higher-level number theoretical functions such as GCD, primality testing, k`th root, etc.
Keep dependencies low strive for single-header design.
Embedded-friendly design with low memory code/RAM footprint, at least for low-to-moderate bit counts.

As for implementation details, I recommend the following:

Create a new project, something like wide-signed-integer.
At the moment, uintwide_t uses a rather unconventional storage of least-significant limb being located at limb position 0. Reverse this ordering to use the more standard order found more commonly of most-significant limb located at limb position 0.
At the moment, uintwide_t uses uint_fast32_t for the bit-count template parameter. This is because size_t is only 16 bits wide on some of the platforms using uintwide_t and more bits were actually desired in extreme situations. In this design mentioned in the thread here, however, adhere to size_t as the type for the number-of-bits parameter.
Do prepare for higher-order multiplication Karatsuba and Toom-Cook, but these are not needed in the first version(s).

To those following my work in Boost.Multiprecision, this differs from Boost's cpp_int partly/predominantly in points 1,2,5,6,7,8. It is hoped that this work will not compete with that work, but rather complement it and generally move the standardization process forward.

johnmcfarlane commented 3 years ago

Prefer the interface in P0539R4.

You're welcome to pursue that design but to be clear, I don't favour it or need it. I'm interested in something like fixed_static_array which is much simpler: just NumLimbs, not Digits2 makes everything a lot simpler.

Support any bit size above a sensible lower limit.

Again, if you're just specifying number of limbs, I assume this is moot. I'm happy to help rethink uintwide_t but still not convinced I'd use a P0539-like type.

Strive for constexpr construction and binary algebraic and comparison operations.

I hope you agree the constant expression experiment is making progress. One think I'd invite you to try out is compile-time unit testing. You can perform a lot of tests using static_assert once you have a literal type. (example)

Implementation language C++20.

You need to balance the desire to rely on new C++ features with the needs of users who rely on older toolchains. What would be the win from dropping C++17?

No need for any higher-level number theoretical functions such as GCD, primality testing, k`th root, etc.

I cannot say. I have observed that people pass over my library because it lacks basic trig functions. Building up a user base is important for a number of reasons and if they appreciate certain functions, then they're important for the library.

Keep dependencies low strive for single-header design.

Not vital but definitely has some benefits. E.g. if it wasn't for _util_dynamicarray.h, you could probably include the project right into compiler explorer: https://godbolt.org/z/c9Kr5z5q9

Create a new project, something like wide-signed-integer.

What would be the main benefit of doing this? Would it be a fork of this repo, with more gradual changes? Do-overs are notoriously risky and costly. And did you consider switching to an org? I've floated that idea here.

At the moment, uintwide_t uses a rather unconventional storage of least-significant limb being located at limb position 0. Reverse this ordering to use the more standard order found more commonly of most-significant limb located at limb position 0.

One benefit of that would be that your ordering becomes lexicographical.

At the moment, uintwide_t uses uint_fast32_t for the bit-count template parameter. This is because size_t is only 16 bits wide on some of the platforms using uintwide_t and more bits were actually desired in extreme situations. In this design mentioned in the thread here, however, adhere to size_t as the type for the number-of-bits parameter.

I see no need to use fast variants for template parameters. I've always favoured int for all quantities. But you bring up a good point: shouldn't both the libraries maybe go with something wider because of AVR?

ckormanyos commented 3 years ago

Hi @johnmcfarlane it is time to talk signed interface.

Today's uintwide_t has the following template signature:

  // Forward declaration of the uintwide_t template class.
  template<const std::uint_fast32_t Digits2,
           typename LimbType = std::uint32_t,
           typename AllocatorType = void>
  class uintwide_t;

I can use a fourth template parameter for signed.
We could also create a base class with signed/unsigned derived classes.
Another class entirely, such as intwide_t, whereby the bulk of algorithms are extracted from uintwide_t and subsequently used by both the signed and the unsigned type.

I kind of favor 1 or 3. In one, I'm not sure where to put that 4th signed-ness template parameter. 4 makes that easy(er), but then might have some redundant code.

in each of these, I am nut sure how much if any interoperation between the signed/unsigned type to have. The base class, on the other hand, might make that easier, but might make it harder to get constexpr-ness.

Your thoughts?

johnmcfarlane commented 3 years ago

Have you considered aliases? For example, you could write

template<
    long Bits = 32,
    typename TopLimbType = std::int32_t,
    typename LimbType = std::make_unsigned_t<TopLimbType>,
    typename AllocatorType = void>
class wide_integer;

template<long Bits = 32, typename AllocatorType = void>
using uintwide_t = wide_integer<Bits, std::uint32_t, std::uint32_t, AllocatorType>;

template<long Bits = 32, typename AllocatorType = void>
using intwide_t = wide_integer<Bits, std::uint32_t, std::int32_t, AllocatorType>;

Do you require that all limbs are the same width? If so, you could just rely on three parameters:

template<
    long Bits = 32,
    typename TopLimbType = std::int32_t,
    typename AllocatorType = void>
class wide_integer
{
    using limb_type = std::make_unsigned_t<TopLimbType>;
};

template<long Bits = 32, typename AllocatorType = void>
using uintwide_t = wide_integer<Bits, std::uint32_t, AllocatorType>;

template<long Bits = 32, typename AllocatorType = void>
using intwide_t = wide_integer<Bits, std::int32_t, AllocatorType>;

Some bikeshedding notes:

I chose Bits because I use Digits to mean something more particular in CNL: the number of binary digits excluding the two's complement 'sign' bit. This is in line with the meaning of numeric_limits<T>::digits and avoid confusion with Width or Bits which is what you find in <cstdint> types.
I chose long following the conversation about 16-bit architectures. I currently use int but you've persuaded me I should pick something less narrow. I also follow the convention that this is an arithmetic parameter and so should be signed.
You might, however, argue that this is really a bit manipulation variable and that unsigned is correct. In that case, std::size_t or std::uint32_t is probably the right choice.
You might prefer to stick with Digits but then have 31 digits in an int32_t. Does that make sense? (Then I think it's more likely to be a signed type like long.)
wide_integer is that this would be called in CNL, but maybe you could call these intwide_t, uintwide_t, sintwide_t. Or something else.

ckormanyos commented 3 years ago

Thank you for all that input. I will study, think and also brush up on binary 2's complement arithmetic, like complement, arithmetic shift. That is probably where the diffs are.

persuaded me I should pick something less narrow

To this, yet another, perhaps even larger issue. For-like-ever, std::numeric_limits<T>::digits has the type int. If you really follow my lamentations about int being 7, 15, 31, 64 digits wide on platforms of whatever, then the deeper thought might be what to do with good old <limits> in light of really wide integral, floating-point and fixed-point types. This seems to be, in fact, the one problem in 14882 that I can not for years seem to effectively work around.

As for the above comments, I will slowly work through them.

johnmcfarlane commented 3 years ago

This seems to be, in fact, the one problem in 14882 that I can not for years seem to effectively work around.

See the email I just sent to Walter Brown. There is a replacement for numeric_limits proposed. I also implement something similar in CNL, digits<T>, which I now realise I'll have to change as well!

I think it might be better to go with a language type (e.g. long or long long) instead of a fixed-width type (e.g. int32_t or int64_t) because they might be different underlying types on different architectures. Relatively minor concern.

I will study, think and also brush up on binary 2's complement arithmetic, like complement, arithmetic shift. That is probably where the diffs are.

I would think about starting from (or at least planning for) the other end of the sausage factory:

Algorithms (much like those in <algorithms> but for numbers. (Maybe take a quick look at P0237. The author claims that it will be of great help in creating multi-word algorithms.) These might look like:
```
template<typename Word>
Multiword add(span<Word> const& lhs, span<Word> const& rhs);
```
Make sure these are exactly what you need, that they're as simple as possible. If you think you can more efficiently implement addition of words of different lengths, then change this to:
```
template<typename Word1, typename Word2>
auto add(span<Word1> const& lhs, span<Word2> const& rhs);
```
Get these right first because if they're wrong, you'll end up with a suboptimal solution. You aren't doing users any favours by trying to make these functions easier to use if they don't "accurately convey what the code does".

You probably cannot even start until you know whether you're going to have a signed top word. span might be completely wrong because of this; it's just a placeholder here.
Consider one or more wrappers which do nothing other than provide storage, conversion and arithmetic operators for (1). Similar to fixed_static_int etc.. It shouldn't matter to the functions in (1) whether the data is on the heap or the stack.
Consider a wrapper which does nothing more than abstract away words behind digits/bits by wrapping types in (2). Similar to uintwide_t etc..

ckormanyos commented 3 years ago

Work is being done in #54. At first a simple IsSigned parameter has been introduecd. Let's see if I can get all the signed/unsigned math right? That's my first challenge. If, after that, we would like to tweak the params, should be at that time no problem.

ckormanyos commented 3 years ago

Hi @johnmcfarlane

While implementing, I encounter some design nuances that should be decided. (I ask you John directly, since you are expected to be one of the first preliminary clients of this work.)

What to do when reading an input string for a signed integer in binary, octal or hexadecimal format that has a leading minus sign?
When printing a signed integer with a negative value in either binary, octal or hexadecimal, should a minus sign be printed?
When printing a signed integer in base 10 that has a negative value, but the ios-flags have ios::showpos set?
When performing right shift of a signed integer with a negative value, should arithmetic right shift be done (i.e., fill with 0xFF)?

johnmcfarlane commented 3 years ago

What to do when reading an input string for a signed integer in binary, octal or hexadecimal format that has a leading minus sign?

Firstly, I don't know that I'd be using any such text conversion utilities. A fixed_static_array-like type would be used in CNL interchangeably with fundamental integer types. So CNL already provides type-agnostic text conversion utilities.

Secondly, it would help to answer these questions knowing whether you've chosen to use a signed limb or not. I think it's a little early to be deciding on the individual characters of a string before knowing what the individual limbs are and, accordingly, how numbers will be constructed.

I would not, personally, construct them from strings, but from limbs, and emulate a standard API, such as from_chars. Such a function would calculate the limbs and then construct the number from them. But what would be the type(s) parameter(s) to the constructor? A pack of integer types? An initialiser list? Some library-defined POD type?

If you were emulating from_chars you could follow the rule: "only the minus sign is recognized (not the plus sign), and only for signed integer types of value." IOW, that could be a hard bug.

Longer-term, you might consider supporting a UDL, e.g. auto x = 12345678901234567890123456789012345678901234567890123456789012345678901234567890_library_specific_suffix.

When printing a signed integer with a negative value in either binary, octal or hexadecimal, should a minus sign be printed?

Certainly. But again, see what the standard library does. It doesn't always do the best job and you probably want to favour newer -- rather than older -- APIs because they may be more embedded-friendly (e.g. to_chars and from_chars) but at least they've done the work of asking and answering these questions. (That doesn't mean I'm not more than happy to contribute or clarify but you'll likely get a better answer quicker from the docs!)

When printing a signed integer in base 10 that has a negative value, but the ios-flags have ios::showpos set?

Again, do what the ints do. I've never used it but I assume showpos has no effect on negative sign because otherwise, it would produce wildly incorrect results.

When performing right shift of a signed integer with a negative value, should arithmetic right shift be done (i.e., fill with 0xFF)?

If in doubt, "do what the ints do". (It's a Scott Meyers mantra.) Docs can be found here. "...right shift on signed a is arithmetic right shift..."

But first, you need to design your type. (4) is the closest to timely here. Those other questions probably come later. You may find that arithmetic right shift cripples performance or otherwise hurts usability. If that's the case, thing twice!

ckormanyos commented 3 years ago

first, you need to design your type

Well, my (perhaps naive) plan is that the first implementation will use all unsigned limbs. This is exactly the same internal representation as the unsigned type.

The only functions, then, actually needing to be changed are:

I/O streaming, right shift, division and comparison.
Some extremely rudimentary interpoeration among the signed/unsigned type(s).

Those other questions probably come later.

I get the feeling we might have different scopes of change in mind. The scope of change I am talking about for the signed type is something like a fun weekend activity, a few testing patches, maybe some back and forth on the non-class binary interoperations signed/unsigned add, sub, mul, div, compare, disable sqrt(negative), etc. And as such, there will not really be much of a later, as this will be done relatively soon.

johnmcfarlane commented 3 years ago

Well, my (perhaps naive) plan is that the first implementation will use all unsigned limbs. This is exactly the same internal representation as the unsigned type.

That might be entirely the right choice. I have no idea. You're the expert!

I get the feeling we might have different scopes of change in mind. The scope of change I am talking about for the signed type is something like a fun weekend activity, a few testing patches, maybe some back and forth on the non-class binary interoperations signed/unsigned add, sub, mul, div, compare, disable sqrt(negative), etc. And as such, there will not really be much of a later, as this will be done relatively soon.

Yes, I think I've underestimated the amount of functionality that is embedded in the uintwide_t type. But I think that it could be embedded within the cnl::wide_integer type -- at least to find out where the wrinkles are. Ideally, a wide integer type wrapped in another wide integer type is not ideal. That's the thing we can maybe defer. Thanks for the clarification!

ckormanyos commented 3 years ago

But I think that it could be embedded within the cnl::wide_integer type -- at least to find out where the wrinkles are. Ideally, a wide integer type wrapped in another wide integer type is not ideal. That's the thing we can maybe defer.

Yes. That is, indeed, exactly what I mean. At first I will implement a simple, non-breaking extension of uintwide_t to get some kind of a usable signed type rather sooner than later.

As a second step, after the numbers work out, and feasibility in cnl hopefully harmonizes, we take a deeper dive into actually quite a few of the more ideal steps you have mentioned in a variety of threads including better separation of the type(s), separated operator abstractions, etc.

ckormanyos commented 3 years ago

Fixed by #54

Hi @johnmcfarlane the preliminary implementation of signed is in master. I decided to use a (slightly oddly placed) 4th template parameter for optional signed-ness of uintwide_t. There are some optimizations needed such as global ops with built in integral types and construction form/cast to built-in floating point types. But these were missing from the unsigned type also.

Please be aware that this is new code. We might need to find a bug or two. But CI and preliminary tests are looking good.