Closed williamstein closed 14 years ago
We've had this sort of problem (many times) before...
The first obvious thing to try is to rebuild with optimization off.
Then try to fix and simultaneously email Steve Linton.
Upstream: Reported upstream. Little or no feedback.
I tried building without -O2, and the problem vanishes. I did this by brutally deleting -O2 everywhere in the src tree (!). Then I built and now suddenly GAP works:
[wstein@lena sage-4.4]$ ./sage
----------------------------------------------------------------------
| Sage Version 4.4, Release Date: 2010-04-24 |
| Type notebook() for the GUI, and license() for information. |
----------------------------------------------------------------------
sage: gap('2+2')
4
The issue here seems to be that gap is returning a StructInitInfo*
from SyFindOrLinkGapRootFile
inside a char buffer. In the process of extracting the pointer again in READ_GAP_ROOT
, gap might be breaking strict aliasing rules.
I think that ideally this method of returning values should be removed entirely (since it seems to serve no purpose), and SyFindOrLinkGapRootFile
should just get an extra argument of type StructInitInfo**
for returning this pointer.
Incidentally gcc 4.5.0 also complains about a similar issue in SaveDouble
, LoadDouble
. Although that seems to be currently working, using a union there might be safer than casting directly.
Building on Dima's new p2 from #8774, I created a p3 that avoids the strict aliasing problems by cleaning up some of the code. I also un-pre-applied saveload.patch in src/, and cleaned up a number of files in src/tst/ (including the large src/tst/GAP\ 4\ PPC
) which were somehow copied from src/.
New spkg at:
http://www.math.leidenuniv.nl/~wpalenst/sage/gap-4.4.12.p3.spkg
Replying to @wjp:
Building on Dima's new p2 from #8774, I created a p3 that avoids the strict aliasing problems by cleaning up some of the code. I also un-pre-applied saveload.patch in src/, and cleaned up a number of files in src/tst/ (including the large
src/tst/GAP\ 4\ PPC
) which were somehow copied from src/.New spkg at:
http://www.math.leidenuniv.nl/~wpalenst/sage/gap-4.4.12.p3.spkg
this needs to be tested on every platform Sage supports, (itaniums, macs, sparcs...).
This is quite a drastic change, and there is no telling how older versions of gcc would cope! I think it should get seen by real GAP kernel folks before we start even think of shipping this.
Should I post this on GAP-support, of which I happen to be a member?
Finally, I do not understand why it is a blocker. Is 4.4.1 going to support gcc-4.5.0 ?! I would say that attempting to support gcc-***.0 is a waste of time. Let it ripe to at least .1 or better .2... I therefore lower the priority of this to major.
I would also propose another way to get it in Sage: make these new patches conditional on gcc version >=4.5.
Dima
Replying to @dimpase:
this needs to be tested on every platform Sage supports, (itaniums, macs, sparcs...).
This is quite a drastic change, and there is no telling how older versions of gcc would cope!
Not so drastic, really. The changes are very much localised, and the new code is far less risky than the current version. Older versions of gcc should have no problem with it.I agree it needs testing on more platforms, of course.
I think it should get seen by real GAP kernel folks before we start even think of shipping this.
Should I post this on GAP-support, of which I happen to be a member?
If you could, that would be great. Thanks.
I would also propose another way to get it in Sage: make these new patches conditional on gcc version >=4.5.
In my opinion writing such a check would cause more potential for trouble than the changes.
-Willem Jan
P.S. The patch I added to this spkg likely doesn't satisfy gap's formatting/documentation/code conventions. It's intended as the least invasive way to fix the immediate problem, rather than as a way to clean up the offending code properly.
Replying to @wjp: [...]
I would also propose another way to get it in Sage: make these new patches conditional on gcc version >=4.5.
In my opinion writing such a check would cause more potential for trouble than the changes.
Hoezo, Willem Jan? :-)
You don't need to do #if 's in the code... Get gcc version in spkg-install, and only apply new patches in case you get right version. And print a warning that it's a new, not quite tested, stuff. A couple lines of shell code...
Dima
-Willem Jan
Replying to @dimpase:
Replying to @wjp:
Building on Dima's new p2 from #8774, I created a p3 that avoids the strict aliasing problems by cleaning up some of the code. I also un-pre-applied saveload.patch in src/, and cleaned up a number of files in src/tst/ (including the large
src/tst/GAP\ 4\ PPC
) which were somehow copied from src/.New spkg at:
http://www.math.leidenuniv.nl/~wpalenst/sage/gap-4.4.12.p3.spkg
this needs to be tested on every platform Sage supports, (itaniums, macs, sparcs...).
This is quite a drastic change, and there is no telling how older versions of gcc would cope! I think it should get seen by real GAP kernel folks before we start even think of shipping this.
Should I post this on GAP-support, of which I happen to be a member?
Finally, I do not understand why it is a blocker. Is 4.4.1 going to support gcc-4.5.0 ?!
Yes.
I would say that attempting to support gcc-***.0 is a waste of time.
Not if it's what our clients want.
Replying to @dimpase:
Replying to @wjp: [...]
I would also propose another way to get it in Sage: make these new patches conditional on gcc version >=4.5.
In my opinion writing such a check would cause more potential for trouble than the changes.
You don't need to do #if 's in the code... Get gcc version in spkg-install, and only apply new patches in case you get right version. And print a warning that it's a new, not quite tested, stuff. A couple lines of shell code...
Let me rephrase it: I don't think the changes are risky enough to justify having two separate versions of the code.
Merged: 4.4.1.alpha0
From upstream, which approves:
Well done Willem. The diagnosis looks correct. This is old code and clearly the nice solution is to use a suitable union type so as to be more honest with the C compiler about what is actually being passed back and forth. Is there an easy way to pass me the proposed patch and I'll either apply it to GAP development branch or send you a preferred version if I want to do it a little differently.
SaveDouble/LoadDouble can probably also be handled correctly with unions, so long as, in
union { char bytes[sizeof(double)], double d};
one can rely on bytes including all the data of double. Does anyone know anything about this? Otherwise might be better to use void * and memcopy or something like that.
Steve
More readable: "Well done Willem. The diagnosis looks correct. This is old code and clearly the nice solution is to use a suitable union type so as to be more honest with the C compiler about what is actually being passed back and forth. Is there an easy way to pass me the proposed patch and I'll either apply it to GAP development branch or send you a preferred version if I want to do it a little differently.
SaveDouble/LoadDouble can probably also be handled correctly with unions, so long as, in
union { char bytes[sizeof(double)], double d};
one can rely on bytes including all the data of double. Does anyone know anything about this? Otherwise might be better to use void * and memcopy or something like that.
Steve"
Reviewer: Dmitrii Pasechnik
Author: Willem Jan Palenstijn
Changed reviewer from Dmitrii Pasechnik to Dmitrii Pasechnik, William Stein
Changed upstream from Reported upstream. Little or no feedback. to Reported upstream. Developers acknowledge bug.
for the record:
the saveload.patch is still needed on ia64 (Itanium). The symptom (corrupt GAP workspace names created by GAP's SaveWorkspace) reappears if I unapply it, so the problem didn't go away after the strict aliases patch.
Changed merged from 4.4.1.alpha0 to sage-4.4.1.alpha0
If you build GAP as part of sage-4.4 with GCC-4.5.0 (on lena):
Exactly the same thing works fine with Sage-4.3.3 built using GCC-4.4.x on lena:
Deleting workspaces, etc., doesn't help. The GAP spkg is unchanged between these two versions of Sage.
Upstream: Reported upstream. Developers acknowledge bug.
CC: @wjp
Component: build
Author: Willem Jan Palenstijn
Reviewer: Dmitrii Pasechnik, William Stein
Merged: sage-4.4.1.alpha0
Issue created by migration from https://trac.sagemath.org/ticket/8773