vermaseren / form

The FORM project for symbolic manipulation of very big expressions
GNU General Public License v3.0

[tform] reading a bracket may crash with B+ when the expression doesn't fit in the scratch buffer #165

Closed tueda closed 7 years ago

tueda commented 7 years ago

Here is a small example showing that accessing a bracket in tform may crash when B+ is used, depending on the scratch size. The original program had a ~1 GB expression as input; the example below uses a small scratch size instead:

#:MaxTermSize 200
#:ScratchSize 12800
CF f,g;
S n;
L F = <f(1)>+...+<f(100)>;
multiply <g(1)>+...+<g(100)>;
B+ f;
*B- f;  * <-- (1)
ModuleOption noparallel;
.sort
id g(n?) = F[f(n)];
P;
*ModuleOption noparallel;  * <-- (2)
.end

Running this program with, e.g., tform -w4 crashes easily. Note that enabling line (1) or (2), or increasing ScratchSize to ~160000, avoids the crash.

tueda commented 7 years ago

Another way to avoid the bug is to undefine WHOLEBRACKETS in threads.c.

vermaseren commented 7 years ago

Hi Takahiro,

That is rather clear. I am working on it at the moment, but it is rather nasty. I brought the 100 down to 80, and now it hangs at a certain point, which is easier to debug. But I still have not seen what could have caused it. The bracket buffer is big enough for all brackets, hence it should hit the bracket directly. The reads I see are all protected by a lock and a positioning. Somewhere there could be a very special case where one word is missed.

To be continued...

Jos

tueda commented 7 years ago

To me, it is not clear why WHOLEBRACKETS leads to a constant AN.Frozen...

vermaseren commented 7 years ago

Hi Takahiro,

That I have not seen yet. I am sitting in gdb in a thread that hangs, and bi->start (line 1788 in threads.c) is bigger than where. This makes no sense. Although... I think I may be onto it, now that I think about it. AR.infile is used here for the brackets, and then the question is what is used for reading the contents of the brackets. I will have to see. The solution probably lies in Generator.

To be continued.

Jos

tueda commented 7 years ago

This may be related or may be a different bug, but if I change the number of terms to

L F = <f(1)>+...+<f(79)>;
multiply <g(1)>+...+<g(16)>;

I get a different flavor of error message:

TFORM 4.1 (Dec 24 2016, v4.1-20131025-289-gfa1c759) 64-bits 4 workers  Run: Fri Jan 13 21:03:40 2017
    #:MaxTermSize 200
    #:ScratchSize 12800
    CF f,g;
    S n;
    L F = <f(1)>+...+<f(79)>;
    multiply <g(1)>+...+<g(16)>;
    B+ f;
    *B- f;  * <-- (1)
    ModuleOption noparallel;
    .sort

Time =       0.00 sec    Generated terms =       1264
               F         Terms in output =       1264
                         Bytes used      =      32880
    id g(n?) = F[f(n)];
    P;
    *ModuleOption noparallel;  * <-- (2)
    .end
Error while reading scratch file in GetTerm
Program terminating in thread 1 at 1.frm Line 12 -->

vermaseren commented 7 years ago

Hi Takahiro,

I think I got it. At least it passes the bug reports, including your last one. The problem is in threads.c, in DOBRACKETS. After the GetTerm it calls Generator, and because you use the contents of the brackets, the file position in the thread-specific AR.infile is changed. Then, at the end of the loop, SeekScratch gets the wrong position. Solution: call SeekScratch just after GetTerm, and do a SetScratch where the SeekScratch used to be. I will put it on GitHub.

Cheers

Jos
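
The ordering problem described above can be illustrated with a minimal sketch in plain C stdio. This is not the code from threads.c: it assumes an ordinary FILE stream in place of the thread-specific scratch file, and the names REC, process, walk and make_demo_file are hypothetical. Here process stands in for Generator (it reads elsewhere in the same stream), ftell plays the role of SeekScratch, and fseek plays the role of SetScratch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REC 8 /* fixed record size for the demo (hypothetical) */

/* Stand-in for Generator: reads from another place in the same
   stream, which moves the stream position as a side effect. */
static void process(FILE *f) {
    char tmp[REC];
    fseek(f, 0L, SEEK_SET);
    if (fread(tmp, 1, REC, f) != REC) return;
}

/* Walks the records and returns how many iterations ran, capped at limit.
   save_early != 0: position saved right after the read (fixed order).
   save_early == 0: position sampled after process() moved it (buggy order). */
static int walk(FILE *f, int save_early, int limit) {
    char rec[REC];
    long pos = 0;
    int n = 0;
    rewind(f);
    while (n < limit && fread(rec, 1, REC, f) == REC) {
        if (save_early) pos = ftell(f);  /* remember where the next record starts */
        process(f);                      /* disturbs the stream position */
        if (!save_early) pos = ftell(f); /* too late: this is process()'s position */
        fseek(f, pos, SEEK_SET);         /* restore before the next read */
        n++;
    }
    return n;
}

/* Builds a temporary file holding nrec dummy records. */
static FILE *make_demo_file(int nrec) {
    FILE *f = tmpfile();
    char rec[REC];
    int i;
    assert(f != NULL);
    memset(rec, 'x', REC);
    for (i = 0; i < nrec; i++) fwrite(rec, 1, REC, f);
    rewind(f);
    return f;
}
```

With a three-record file, the fixed order visits each record once and stops at the end of the file, while the buggy order keeps re-reading the same record until the iteration cap is hit. In this toy setting that resembles the hang seen in the debugger, although the real failure mode in tform depends on details not modelled here.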

tueda commented 7 years ago

Thanks! The new version (769f7b5cf23483e5bf892cd73c0c20a6772cbd5e) also works for the 1GB example. (But #162 is not fixed yet.)

vermaseren commented 7 years ago

Hi

Sorry, wrong number. I’ll have a look at that then.

Jos
