Closed albinahlback closed 4 months ago
Not sure if
fmpz_set_uiui
andfmpz_get_uiui
and friends should be completely deinlined. They feature in some rather important loops.
Sure. I'll try to figure it out.
Does this work? I assume that the fmpz-collector never pushes anything with less than two limbs. I will just check that it indeed is the case for all files in fmpz/link/
.
The PR got a little bit bigger than I intended. Notably, I inserted an assertion that any returned mpz in _fmpz_clear_mpz
has at least two allocated limbs.
Just as to note: Currently, with FLINT_TEST_MULTIPLIER
larger than something around 5.5, the test for fq_nmod_mpoly_factor
fails due to a multivariate polynomial not being canonical. I don't know if that is present in the current state of the master branch, but this is something to look into.
Perhaps we should increase the test multipliers for runners with assert.
Not sure if we really want to change all
mpz_realloc(x, alloc);
to
if (x->_mp_alloc < alloc)
mpz_realloc(x, alloc);
as the former can actually free unused memory. This might be desirable, at least in some of the places where it is used.
Not sure if we really want to change all
mpz_realloc(x, alloc);
to
if (x->_mp_alloc < alloc) mpz_realloc(x, alloc);
as the former can actually free unused memory. This might be desirable, at least in some of the places where it is used.
Then that should be handled by the fmpz collector. We have set it to 64 limbs, which I think is a reasonable number.
Then that should be handled by the fmpz collector. We have set it to 64 limbs, which I think is a reasonable number.
Yes and no. What can happen in a long running calculation with mixed size numbers is that all cached mpzs eventually saturate the limit and use far more memory than needed when recycled for smaller operands. But if we avoid this problem at all right now, it is probably by accident.
Then that should be handled by the fmpz collector. We have set it to 64 limbs, which I think is a reasonable number.
Yes and no. What can happen in a long running calculation with mixed size numbers is that all cached mpzs eventually saturate the limit and use far more memory than needed when recycled for smaller operands. But if we avoid this problem at all right now, it is probably by accident.
Sure, I agree with this. But I don't think that is the responsibility of the "smaller" arithmetic routines.
I hope that you can agree with me that the responsibility could fall on
_fmpz_vec_content_chained
), andSure, I agree with this. But I don't think that is the responsibility of the "smaller" arithmetic routines.
Sure enough, GMP has a MPZ_REALLOC
macro. We could have a similar macro (or inline function). It would then be easy to adjust the global behavior.
However, consider a case like this in fmpz/bit_unpack.c
(original code):
/* allocate space for l limbs only */
mpz_realloc(mcoeff, l);
Here the intent was clearly to resize tightly. Indeed, this makes sense when you are unpacking a big polynomial; in that case, you want to avoid excess allocation for every coefficient.
Please let me know if you can identify some other functions which may benefit from such reallocations.
Looks OK except that the names MPZ_REALLOC
and MPZ_DO_REALLOC
don't really suggest what their difference is, and we should be careful about intruding on the GMP namespace. Maybe rename
MPZ_REALLOC -> FLINT_MPZ_REALLOC or FLINT_MPZ_REALLOC_AT_LEAST
MPZ_DO_REALLOC -> FLINT_MPZ_REALLOC_TIGHT or FLINT_MPZ_REALLOC
Good point.
Will merge if the tests passes.
Not sure if
fmpz_set_uiui
andfmpz_get_uiui
and friends should be completely deinlined. They feature in some rather important loops.