vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0
985 stars 118 forks source link

Make sure SIOsize is >= MaxTermSize #515

Closed jodavies closed 1 month ago

jodavies commented 1 month ago

Various buffers in AllocSort depend on the IOsize parameter. At the main level this is >= MaxTermSize, but this was not the case when sorting function args, dollar variables.

Fixes the examples in #80 #292

jodavies commented 1 month ago

I am unsure about adding a test for #80 : in that case failure means writing out enormous files on github actions...

coveralls commented 1 month ago

Coverage Status

coverage: 48.765% (-0.02%) from 48.788% when pulling f31628710de93ada37b0160e442e744a55c55b4c on jodavies:issue-80 into 74735f57a7351e5c790648f2332d06c164d701d0 on vermaseren:master.

jodavies commented 1 month ago

With 32bits, we can't have a MaxTermSize large enough for this test, it is limited by https://github.com/vermaseren/form/blob/0112aa6b21b48616575c6cd205ec00c7797dbc93/sources/setfile.c#L437

This test needs to be skipped for 32 bits probably, or I will try to make a version of the test with smaller expressions and buffer sizes that works for 32bit also.

vermaseren commented 1 month ago

How about a preprocessor variable that tells the size of a WORD?

On 9 May 2024, at 18:06, jodavies @.***> wrote:

With 32bits, we can't have a MaxTermSize large enough for this test, it is limited by https://github.com/vermaseren/form/blob/0112aa6b21b48616575c6cd205ec00c7797dbc93/sources/setfile.c#L437

This test needs to be skipped for 32 bits probably, or I will try to make a version of the test with smaller expressions and buffer sizes that works for 32bit also.

— Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/pull/515#issuecomment-2102963326, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJPCES33IIGNEGEEWWXZTTZBONH5AVCNFSM6AAAAABHO6ILHWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBSHE3DGMZSGY. You are receiving this because you are subscribed to this thread.

jodavies commented 1 month ago

Hi Jos,

At least in the test suite it is already easy to disable tests for 32bit word size.

I was actually trying to change the allocation to help with debugging these cases, but got stuck at some point. I want to split the large allocation in AllocSort into lots of smaller ones, so that buffer overruns there will trigger a valgrind error. But it seems that some buffers at least, must be contiguous: the large buffer + extended small buffer, for instance. Are there any others that are used in a combined way?

vermaseren commented 1 month ago

Hi Josh,

I am not sure. There are more than 500 calls with Malloc1. Most are clearly harmless. If a compiler buffer has something, it should also be harmless. There may be something in the checkpoints, but again, that should not be an issue here. Also the scratch files should be OK, because a bracketbuffer is not really part of that.

The reason the sort buffer is set up like that because it can be used this way as one giant cache. But as far as I know that should be all. But many things are more than 30 years ago. Actually this year it is 40 years ago I started Form. Imagine….

On 9 May 2024, at 19:35, jodavies @.***> wrote:

Hi Jos,

At least in the test suite it is already easy to disable tests for 32bit word size.

I was actually trying to change the allocation to help with debugging these cases, but got stuck at some point. I want to split the large allocation in AllocSort into lots of smaller ones, so that buffer overruns there will trigger a valgrind error. But it seems that some buffers at least, must be contiguous: the large buffer + extended small buffer, for instance. Are there any others that are used in a combined way?

— Reply to this email directly, view it on GitHub https://github.com/vermaseren/form/pull/515#issuecomment-2103112523, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABJPCERK2SUMLGLCJOEBYDDZBOXV7AVCNFSM6AAAAABHO6ILHWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBTGEYTENJSGM. You are receiving this because you commented.

tueda commented 1 month ago

At least in the test suite it is already easy to disable tests for 32bit word size.

It is actually "difficult" because now FORM doesn't give any information about 32/64bit directly, and that's why by make check you are always warned as

warning: failed to get the wordsize of '../sources/form'
warning: assuming wordsize = 4

It was "easy" in version 4.3.

jodavies commented 1 month ago

Sure, but still it is easy to use #require wordsize == 4 right? Anyway, at least in this case the test is now suitable for 32bit builds.

While investigating this I noticed that the "default" buffer sizes are somehow "not good", since AllocSort alters a lot of them before making the actual allocation. It is a bit hard for the user to know if their specified buffers are actually used in the end. Maybe some warnings could be printed from there (at allwarnings level?) for interested users...

tueda commented 1 month ago

Sure, but still it is easy to use #require wordsize == 4 right?

Right. Indeed, for 32-bit containers, we inform the test suite that the FORM binary uses wordsize = 2.

https://github.com/vermaseren/form/blob/0112aa6b21b48616575c6cd205ec00c7797dbc93/.github/workflows/test.yml#L280

While investigating this I noticed that the "default" buffer sizes are somehow "not good", since AllocSort alters a lot of them before making the actual allocation. It is a bit hard for the user to know if their specified buffers are actually used in the end. Maybe some warnings could be printed from there (at allwarnings level?) for interested users...

Probably developers also need this. In some test cases, we fix many buffer sizes to reproduce specific conditions. If the way to alter the buffer sizes is changed in future versions (which may be triggered by changing the default settings for other buffers, like this), then our assumptions about the fixed buffer sizes could easily be invalidated. We need errors when the default or specified parameters are altered to different values.

jodavies commented 1 month ago

Errors might be a bit of a pain for regular use though (and like the tests, various real-world scripts will suddenly stop running), but maybe a mode that can be enabled? On ExactBuffers; or something.

tueda commented 1 month ago

Maybe this PR is ready to merge and indeed has a high priority?

jodavies commented 1 month ago

Yes, this one is important. Several of the recently committed tests are effectively implementing this by manually setting SubSortIOSize to be able to run. Should we remove those manual settings again, along with this?

tueda commented 1 month ago

I think for now we can keep the manual buffer settings in the tests (unless they conflict with the automatic settings).

I will merge this PR as well.