flintlib / flint

FLINT (Fast Library for Number Theory)
http://www.flintlib.org
GNU Lesser General Public License v3.0
446 stars 245 forks source link

Only 4 bytes are being used for storing size #1908

Open Cyclopropinon opened 7 months ago

Cyclopropinon commented 7 months ago

In the documentation of fmpz_out_raw in Version 3.2.0 I saw that the size is stored in 4 bytes. What happenes to numbers greater than 2^(2³²-1)-1? grafik imo it would be better suited to use 8 bytes to indicate the size to allow even bigger integers

albinahlback commented 7 months ago

It stems from GMP's definition of __mpz_struct, where the allocation and size fields are ints. Now, int doesn't have to be four bytes, but for most systems they are.

Now, changing this from four to eight bytes would not only break FLINT, but it would also break the convention that GMP has established.

Moreover, sizeof(__mpz_struct) would no longer be be a power of two, which is nice for memory alignment. As a side note, the most performant CPU's for multiple precision arithmetic, namely x86_64 and 64-bit Arm systems, specifically benefits from __mpz_struct being 128 bytes. For x86_64, we have that multipliers in memory operands can be 16, and Arm has ldp so it can load 128 bytes in one instruction.

albinahlback commented 7 months ago

There is nothing stopping you from using bigger numbers than $2^{32} - 1$, but then you'd have to use the low-level interfaces instead (see GMP's low-level interface).

If you ever encounter a situation that requires bigger integers than this number, you are probably developing some specialized system since each integer of that size is at least 4 GB.

fredrik-johansson commented 7 months ago

Note that this limit is in bytes, not in bits or limbs. An mpz can actually go up to 2^37 bits, or 41 billion digits, before overflowing. The mpz_out_raw overflow occurs already at 2^34 bits, or 5 billion digits, making it quite a lot easier to hit.

I've run into this limitation before and had to work around it. All the data is written, but the size info is incorrect, so you'll have to recover it from the file size. We could change the format for this specific function but that would break compatibility with GMP, if that is an issue.

It would be nice to rewrite fmpz to get rid of these 32-bit restrictions entirely, but it's a lot of work.

albinahlback commented 7 months ago

I mean, for specialized purposes (such as your case), one can just compile GMP and FLINT with the fields to long int, no? Or are there hardcoded things in the code that one has to fix as well?

fredrik-johansson commented 7 months ago

I mean, for specialized purposes (such as your case), one can just compile GMP and FLINT with the fields to long int, no? Or are there hardcoded things in the code that one has to fix as well?

There are probably a few ints that need to be changed manually. Either way, it ought to work out of the box. It's silly that we support 64-bit sizes pretty much everywhere except in integers.

albinahlback commented 7 months ago

I suppose that it could speed some things up to avoid zero extensions (i.e. promotion from 32-bit unsigned to 64-bit unsigned). Still don't like having such this struct not being 128 bits.

fredrik-johansson commented 7 months ago

There are solutions that don't require having a bigger mpz structure. IIRC the GMP developers have planned (but so far not implemented) a hack whereby the size field can overflow to a 48-bit integer while the alloc field becomes a 16-bit float. It's going to be a headache to test though, especially because only huge operands are affected.

Another solution is to have an fmpz point directly to limbs, where the first two limbs act as a header (size + alloc), avoiding the second pointer. The disadvantage is that you can't mock up a "view" integer that points to some existing limbs without making a copy to prefix the header.

There's maybe another, much more hacky and less portable solution. At least on x86-64, the high 16 bits of a pointer are never used. In principle, this means that you could use the high bits of a pointer fmpz for inlined 7-bit size and alloc fields. Then you go to an mpz-like structure with full 64-bit sizes only when the numbers are larger than 64 limbs or so. That would be easier to test and maybe even faster for numbers with only a few limbs.

Either way, you will end up with some more branches though, and that might be a bigger performance killer than the struct sizes.