Closed GoogleCodeExporter closed 8 years ago
I need the .diff files of the failed tests from the test directory. (The .log
files don't help, they're just the original test logs.)
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 5:44
Is that the LLVM 2.5 package from FC11, or was LLVM built manually? Which
configure
options?
Also, please supply a complete build log of Pure so that I can see which
options Pure
was compiled with. Thanks!
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 5:48
BTW, does "Rawhide" refer to FC11 or FC12? I'm not familiar with the FC release
cycles/names.
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 5:50
Rawhide always refers to the next development version. The release number
reported in /etc/fedora-release is ${previous_release}.9x -- thus 11.90, 11.91
etc., only becoming 12 when we're ready for the actual release. It's like
Debian's .. err... Sid, is it?
Oh, and throwing this in too: the test011 log just before it just stops
responding:
test011.pure: *** glibc detected *** ./pure: malloc(): smallbin double linked
list
corrupted: 0x000000000198a5a0 ***
Here's the pure build:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1625828
and the corresponding LLVM build:
http://koji.fedoraproject.org/koji/buildinfo?buildID=128751
Original comment by Michael....@gmail.com
on 22 Aug 2009 at 6:18
Attachments:
I just noticed that you don't have --enable-pic in the ppc build of llvm. Isn't
that
needed there?
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 7:05
Someone else mentioned something to that effect. Shouldn't it affect the pure
build, though, if it's necessary? Is there any way to know for sure short of
testing on a PPC machine?
Original comment by Michael....@gmail.com
on 22 Aug 2009 at 7:14
Ok, quite obviously the test failures in test015, 024 and 025 are all caused by
strcmp returning bogus comparison results. Could you please fire up the Pure
interpreter on that system and try the following expressions:
__C::strcmp "a" "a";
__C::strcmp "a" "b";
__C::strcmp "b" "a";
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 7:45
Concerning test011, could you please run that manually and send me the complete
output of the following command:
pure -v < test011.pure
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 7:52
Re: Comment #6.
Frankly, I don't know whether it has an effect on ppc at all. I guess that if
Pure
links ok on that platform then LLVM should be ok, too. But a better test is if
Pure
passes all checks on that machine; then chances are good that the LLVM package
is
really in good working condition. ;-)
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 7:59
Hmm yes, bad:
$ pure
Pure 0.30 (x86_64-unknown-linux-gnu) Copyright (c) 2009 by Albert Graef
This program is free software distributed under the GNU Public License
(GPL V3 or later). Type 'help copying' for details.
Loaded prelude from /usr/lib64/pure/prelude.pure.
> __C::strcmp "a" "a";
73513984
> __C::strcmp "a" "b";
73513984
> __C::strcmp "b" "a";
73513984
>
Different values each time pure is invoked, but always the same value no matter
what
you're comparing:
> __C::strcmp "b" "a";
1880927232
> __C::strcmp "a" "b";
1880927232
> __C::strcmp "a" "a";
1880927232
The Fedora package build would fail if make check fails -- eh, correction, it's
not
failing! With two tests enabled:
$ make check && echo OK || echo Not
Running tests.
prelude.pure: passed
test015.pure: FAILED
OK
PPC fails the following (after the ones failing on x86_64 were taken out):
Running tests.
prelude.pure: passed
test001.pure: FAILED
test002.pure: passed
test003.pure: passed
test004.pure: FAILED
test005.pure: FAILED
test006.pure: FAILED
test007.pure: passed
test008.pure: FAILED
test009.pure: FAILED
test010.pure: FAILED
test012.pure: FAILED
test013.pure: FAILED
test014.pure: FAILED
test016.pure: FAILED
test017.pure: FAILED
test018.pure: FAILED
test019.pure: FAILED
test020.pure: FAILED
test021.pure: FAILED
test022.pure: FAILED
test023.pure: FAILED
test026.pure: FAILED
test027.pure: FAILED
test028.pure: FAILED
test029.pure: passed
test030.pure: passed
test031.pure: FAILED
test032.pure: passed
test033.pure: passed
test034.pure: FAILED
test035.pure: FAILED
test036.pure: FAILED
test037.pure: passed
test038.pure: passed
test039.pure: passed
test040.pure: FAILED
test041.pure: FAILED
test042.pure: FAILED
I'm still updating some debugging packages (they tend to be rather hefty) so no
backtrace yet. Looks like we need PIC for PowerPC after all, I'll do that now.
Original comment by Michael....@gmail.com
on 22 Aug 2009 at 8:26
Hmm, this might be due to issues in LLVM's dynamic library interface, or there
might
be an issue with the Pure<->C marshalling in Pure's C interface.
Could you please try the following (note the -n flag, which is needed here to
suppress inclusion of the prelude):
$ pure -n
> extern int strcmp(char*,char*);
> strcmp "a" "a";
> strcmp "a" "b";
> strcmp "b" "a";
It also might be useful to look at some of the other string routines:
> extern size_t strlen(void*);
> strlen "abc";
> extern char *strstr(void*,void*);
> strstr "abc" "b";
About the ppc port: An early Pure version was reported to work on Linux/ppc, but
nobody has tested Pure there for quite some time, so there might well be some
issues
with it. If you post the log diffs here then I might try to figure out what's
going
wrong there.
Original comment by aggraef@gmail.com
on 22 Aug 2009 at 9:12
Here we are: typescript is a record of the debugging session, and pure.log is
the
build log. The fixed LLVM build was not tagged yet the last time I tried
building
pure on our build server, unfortunately, so no PPC or i686 build to check for
now (I
*could* try building on my netbook but that's a bit painful).
Here are the GCC versions used in Fedora, by the way:
$ bodhi -L gcc
dist-f10-updates-testing gcc-4.3.2-7
dist-f10-updates-candidate gcc-4.3.2-7
dist-f10-updates gcc-4.3.2-7
dist-f11-updates-candidate gcc-4.4.1-2.fc11
dist-f11-updates gcc-4.4.1-2.fc11
dist-f11-updates-testing gcc-4.4.1-2.fc11
$ rpm -q gcc # on Rawhide
gcc-4.4.1-6.x86_64
No go on the extern declaration:
> extern int strcmp(char*, char*);
> strcmp "a" "a";
-1653433344
> strcmp "a" "b";
-1653433344
> strcmp "b" "a";
-1653433344
Actually, I wonder if this is all just a Rawhide problem. Note the result of
executing test-strcmp attached below:
[michel@erdos ~]$ gcc test-strcmp.c -o test-strcmp
[michel@erdos ~]$ ./test-strcmp a a
-51
[michel@erdos ~]$ ./test-strcmp a b
-51
[michel@erdos ~]$ ./test-strcmp b a
-52
[michel@erdos ~]$ ./test-strcmp foo bar
-56
[michel@erdos ~]$ ./test-strcmp bar foo
-52
Original comment by Michael....@gmail.com
on 23 Aug 2009 at 12:10
Attachments:
ah, nevermind, brain fade. should have compared argv[1] against argv[2], which
works
as expected.
Original comment by Michael....@gmail.com
on 23 Aug 2009 at 12:18
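For readers following along, here is a minimal C sketch of the corrected check described above, comparing argv[1] against argv[2]. It is not the attached test-strcmp.c (which is not reproduced in this thread); the function names cmp_sign and run_test_strcmp are hypothetical. The result is reduced to its sign because C only specifies the sign of strcmp's return value.

```c
#include <stdio.h>
#include <string.h>

/* Reduce strcmp's result to -1, 0 or 1. The C standard only specifies the
   sign of the result, so raw values may differ between libc versions. */
int cmp_sign(const char *s, const char *t)
{
    int r = strcmp(s, t);
    return (r > 0) - (r < 0);
}

/* Hypothetical driver: compares argv[1] against argv[2] and prints the sign.
   Returns 2 on a usage error, 0 otherwise. */
int run_test_strcmp(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s STRING1 STRING2\n",
                argc > 0 ? argv[0] : "test-strcmp");
        return 2;
    }
    printf("%d\n", cmp_sign(argv[1], argv[2]));
    return 0;
}
```

To build a standalone program from this, call run_test_strcmp(argc, argv) from main and return its result.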
I think that test011 also fails because the string comparisons aren't working.
So we
need to get that sorted out first. Might be something in the C interface. But
bigint
arithmetic seems to be working, which also uses the C interface. Weird.
Can you also try strlen and strstr please? (See Comment #11.)
And what's the output of this?
$ pure
> show -d __C::strcmp
Original comment by aggraef@gmail.com
on 23 Aug 2009 at 3:25
Here it is.
Original comment by Michael....@gmail.com
on 24 Aug 2009 at 3:05
Attachments:
The file seems to be garbled (it only contains the last couple of lines). Can
you
mail it to Dr.Graef at t-online.de please?
Original comment by aggraef@gmail.com
on 24 Aug 2009 at 7:29
Done
Original comment by Michael....@gmail.com
on 24 Aug 2009 at 7:45
Reviewer is noting that this problem is definitely Rawhide-only; on F-11 all
tests pass:
https://bugzilla.redhat.com/show_bug.cgi?id=488563
Original comment by Michael....@gmail.com
on 24 Aug 2009 at 9:30
Yes, I suspect an incompatibility between LLVM and Rawhide's gcc. Or it's
something really silly. :) Will post some further tests later.
The log of show -d __C::strcmp is perfect, so Pure's code generator works all
right, thanks.
Original comment by aggraef@gmail.com
on 25 Aug 2009 at 8:37
Ok, can you please try the attached program dltest.cpp?
Compile with: g++ -o dltest dltest.cpp -ldl
Run with: ./dltest
This should print something like:
handle = 0x7fea5bfad000
sym = 0x4006a0, strcmp = 0x4006a0
testing sym: 0, -1, 1
testing strcmp: 0, -1, 1
If that fails, then we're doomed. :) But I think it will work. I'm beginning to
suspect that the issue is actually with the way Pure initializes LLVM's dynamic
loader. This works on every Linux system that I tried, but Rawhide may do
things a
bit differently, so it might actually be necessary to dlopen either libc or the
program itself.
Original comment by aggraef@gmail.com
on 25 Aug 2009 at 10:31
Attachments:
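The kind of check described above can be sketched in plain C as follows. This is not the attached dltest.cpp (which is not reproduced in this thread), only an illustration under the assumption that the test dlopens the running program and looks up strcmp with dlsym. The names lookup_symbol, sign and check_dlsym_strcmp are hypothetical.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Reduce a comparison result to -1, 0 or 1 so output is libc-independent. */
int sign(int r) { return (r > 0) - (r < 0); }

/* Look up `name` among the symbols visible to the running program.
   dlopen(NULL, ...) yields a handle for the main program; dlsym on that
   handle also searches the shared objects loaded at startup (libc included). */
void *lookup_symbol(const char *name)
{
    void *handle = dlopen(NULL, RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }
    /* The handle is deliberately not closed; the program stays loaded anyway. */
    return dlsym(handle, name);
}

/* Returns 0 if the dlsym'd strcmp agrees with the directly linked one. */
int check_dlsym_strcmp(void)
{
    void *sym = lookup_symbol("strcmp");
    if (!sym) {
        fprintf(stderr, "dlsym could not find strcmp\n");
        return 1;
    }
    /* POSIX allows converting a dlsym result to a function pointer. */
    strcmp_fn fp = (strcmp_fn)sym;
    int via_sym[3] = { sign(fp("a", "a")), sign(fp("a", "b")), sign(fp("b", "a")) };
    int direct[3] = { sign(strcmp("a", "a")), sign(strcmp("a", "b")),
                      sign(strcmp("b", "a")) };
    printf("sym = %p\n", sym);
    printf("testing sym: %d, %d, %d\n", via_sym[0], via_sym[1], via_sym[2]);
    printf("testing strcmp: %d, %d, %d\n", direct[0], direct[1], direct[2]);
    return memcmp(via_sym, direct, sizeof via_sym) != 0;
}
```

Call check_dlsym_strcmp() from main to run it. On older glibc versions the program has to be linked with -ldl, as in the g++ command above.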
It succeeded! Ah well, porting to unusual systems is one way of testing
implicit assumptions. I'd love to see what change needs to be made to get it
to work.
Original comment by Michael....@gmail.com
on 25 Aug 2009 at 10:47
Attachments:
Ok, attached is a current svn snapshot of Pure 0.32. Can you please check
whether
that fixes the problem?
Original comment by aggraef@gmail.com
on 25 Aug 2009 at 11:34
Attachments:
No such luck, I'm afraid:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1631971
Original comment by Michael....@gmail.com
on 25 Aug 2009 at 12:05
Ok, can you please try again with the attached tarball? (This adds the
-rdynamic flag
when linking the executable and also has an error check to see whether the
executable
was actually dlopened correctly.)
Also, with the attached version installed, does it print any error message if
you
invoke the interpreter as: pure -q
Original comment by aggraef@gmail.com
on 25 Aug 2009 at 5:54
Attachments:
Here is another test that you can try. It will tell us whether it's just an
issue
with dlopen/dlsym or whether there's something wrong with the LLVM JIT on
Rawhide (I
suspect the latter).
The attached tarball contains a simple test module for Pure. Please unpack it,
then,
with Pure installed, compile and run it as follows (the stock Pure 0.31 will
do, or
any of the custom tarballs I posted above):
$ make
g++ -shared -fPIC -o testmod.so testmod.cc -lpure
$ pure -q -i testmod.pure
> test (addr "strcmp") "a" "b";
This should print something like:
s = 'a', t = 'b'
fp = 0x7f4817adf7b0, strcmp = 0x7f4817adf7b0
Function pointers are identical.
testing fp: 0, -1, 1
testing strcmp: 0, -1, 1
-1
Original comment by aggraef@gmail.com
on 25 Aug 2009 at 6:23
Attachments:
BTW, you'll have to run the above test on x86-64, otherwise (on 32 bit systems)
you
should remove the -fPIC from the compilation command in the Makefile.
Original comment by aggraef@gmail.com
on 25 Aug 2009 at 6:25
Function pointers are not identical. I actually tested it on 0.30, but I'm
assuming the output would be similar for the 0.31 or 0.32 snapshots?
g++ -shared -fPIC -o testmod.so testmod.cc -lpure
[michel@erdos testmod]$ pure -q -i testmod.pure
> test (addr "strcmp") "a" "b";
s = 'a', t = 'b'
fp = 0x7f5ce6f1ae60, strcmp = 0x7f5ce6fc1c00
Function pointers are not identical.
testing fp: -419685376, -419685376, -419685376
testing strcmp: 0, -1, 1
-419685376
Original comment by Michael....@gmail.com
on 25 Aug 2009 at 7:10
Michel, I just committed some changes of the linkage options which seem to fix
some
long-standing linkage-related weirdness on FreeBSD. Maybe we're lucky and these
changes also help with the ppc and FC12 issues that you have. Please try the
attached
tarball, it's another snapshot of current svn (r2133).
Original comment by aggraef@gmail.com
on 27 Aug 2009 at 4:50
Attachments:
Is that fix in 0.32 final (r2135)? A bit rusty on my SVN. I'm assuming that,
since you don't use branches, I can take the revision of the tag to mean the
code is identical to checking out trunk at that revision.
Original comment by Michael....@gmail.com
on 27 Aug 2009 at 9:42
No, the 0.32 release (tagged as 0.32 in svn) is some earlier revision. r2135 is
the
current trunk, which is pretty much like the r2133 I attached in my previous
post,
but with some additional changes in the Makefile.in to make the Windows/mingw
port
compile again. (These all report themselves as 0.32 right now, since I didn't
bump
the version number yet.)
So does r2133 (or r2135) work on FC12?
Original comment by aggraef@gmail.com
on 27 Aug 2009 at 10:59
Ah. Let me try that, then. By the way, the stable 0.32 release still fails the
same tests on F-12, but now fails *all* tests on F-11 (previously, all tests
passed).
I found some bugs with our LLVM packaging -- some paths that are only valid
during
the build process end up in the libraries -- so I'm fixing that first.
Original comment by Michael....@gmail.com
on 27 Aug 2009 at 11:26
2133 passes all tests on F-12 i686, but it looks like the same problem is still
affecting x86_64: the build stalls after test10, presumably because test11
never even
terminates:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1639698
Original comment by Michael....@gmail.com
on 28 Aug 2009 at 12:39
> By the way, the stable 0.32 release [...]
> now fails *all* tests on F-11 (previously, all tests pass).
Hmm, that usually indicates that the interpreter segfaults right at startup,
which in
turn usually means a hosed LLVM package. Please give more details.
Original comment by aggraef@gmail.com
on 28 Aug 2009 at 1:42
> 2133 passes all tests on F-12 i686, but it looks like the same problem is
> still affecting x86_64: the build stalls after test10, presumably because
> test11 never even terminates:
Ok, can you please leave out test011 and post the diffs for the remaining
failures?
Concerning the complete failure on FC11, what do you see when you just run pure
from
the command line?
Original comment by aggraef@gmail.com
on 28 Aug 2009 at 1:55
I've just uploaded Pure 0.33, which has all the latest fixes. Please give that
version a try, thanks.
http://pure-lang.googlecode.com/files/pure-0.33.tar.gz
Original comment by aggraef@gmail.com
on 28 Aug 2009 at 1:38
So, did you see any progress with Pure 0.33? Can I close this bug report?
Original comment by aggraef@gmail.com
on 31 Aug 2009 at 9:58
Original comment by aggraef@gmail.com
on 1 Sep 2009 at 6:02
Sorry for the delay! Just tested pure-0.33. The failures on F-11 are most
likely due to the 32-bit PIC problem (the updated package is still in testing
and has not landed yet).
With pure 0.33:
F-12 Koji: http://koji.fedoraproject.org/koji/taskinfo?taskID=1648278
on i686, every test passes; on x86_64, the same four tests have to be disabled
On Fedora 11 x86_64, all tests pass.
Original comment by Michael....@gmail.com
on 1 Sep 2009 at 3:56
Attachments:
Well, I'm grasping at straws, but maybe gcc 4.4.1 needs the main program
pure.cc to
be compiled with -fPIC, too? This shouldn't be necessary, but it shouldn't hurt
either. Can you try the attached patch to Makefile.in and see whether that
helps on
FC12 x86_64? The patch has to be applied before running configure.
Other than that, I have no idea what else to try. If you have an idea, please
let me
know! To me it looks like the dynamic linker is broken in some way on FC12-64.
I did
review the FC12-64 build log carefully and it doesn't reveal anything out of the
ordinary. (Those stupid warnings about ignored return values can be safely
ignored,
I'll fix these in svn asap.)
Concerning the test failures on ppc: If and when you have the time, I'd
appreciate it
if you could post a current build log and the test logs/diffs, preferably using
Pure
0.33 and LLVM 2.5 on FC11. Maybe I can try to figure out what's going wrong
there.
Original comment by aggraef@gmail.com
on 1 Sep 2009 at 5:55
Attachments:
I can retry the PPC build, sure. Do you want to do it with or without the patch?
Original comment by Michael....@gmail.com
on 1 Sep 2009 at 10:48
And no, the patch does not help (I verified that -fPIC is passed when compiling
pure.cc). Test 11 still bombs out, likewise 15,24,25 fail as before. Logs
attached.
Original comment by Michael....@gmail.com
on 1 Sep 2009 at 11:01
Attachments:
Oh well, I didn't really expect that this one -fPIC would help. :(
There's one more thing you can try on FC12-x64: configure Pure with
--disable-shared.
That will create a statically linked interpreter which usually gets rid of pesky
dynamic linker issues.
Concerning ppc, without the patch please, it wouldn't be of much use there
anyway.
I'd really like to take a look at the logs so I can try to figure out why it's
broken
on ppc. (Pure reportedly worked there, but that was maybe 20 minor versions
ago.)
Original comment by aggraef@gmail.com
on 2 Sep 2009 at 12:55
F-11 ppc build: http://koji.fedoraproject.org/koji/taskinfo?taskID=1649032
I'll ask around if I can get console access -- or better, get you console
access --
on a Fedora ppc machine, if you want to debug this further.
The static build failed the same way as the shared build, unfortunately.
Original comment by Michael....@gmail.com
on 2 Sep 2009 at 2:13
Attachments:
The pure-static-errors.txt file is corrupt (seems that it was truncated to
100KB?),
can you send it by mail, please? But it looks like it's the same error again.
After
all we've tried, I can only conclude that either ld or ld.so is broken on that
system.
One more thing you can try, if you have direct access to an FC12-x64 system, is
to
just do a straight build of both LLVM and Pure, as described in Pure's INSTALL
file,
without all the extra options (-Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector
--param=ssp-buffer-size=4 -m64 -mtune=generic) that I see there in the build
log,
maybe one of these is the culprit.
Original comment by aggraef@gmail.com
on 2 Sep 2009 at 8:29
> I'll ask around if I can get console access -- or better, get you console
> access -- on a Fedora ppc machine, if you want to debug this further.
Yes, that would be helpful. I need at least the test log diffs, right now I can
only
see which tests fail, but not why. :)
Original comment by aggraef@gmail.com
on 2 Sep 2009 at 8:38
Building against LLVM compiled per pure's INSTALL file results in exactly the
same
error behaviour, so as it turns out, that's not it. I'll let you know when I
get a
PPC account ready -- could you e-mail me your SSH public key, so we can get you
login
access?
Original comment by Michael....@gmail.com
on 2 Sep 2009 at 6:01
Ok, thanks to Kevin Fenzi for providing access to a FC12/ppc box so that I
could get
this sorted out. You'll need LLVM >= 2.6 (tested with pre1) and current svn of
Pure
(r2223, or Pure 0.35 when it comes out). I'm still testing LLVM 2.7 from svn
right
now, but I don't expect any problems with that as LLVM 2.6pre1 works fine
already.
Here are my findings:
- LLVM *must* be configured with --disable-expensive-checks to work around some
bogus
assertions in the LLVM code.
- Pure must be configured with --disable-fastcc, as fastcc and/or TCO doesn't
work
with LLVM on ppc right now.
With these settings, all tests pass. So ppc seems to work fine as of r2223. :)
Note that LLVM 2.5 doesn't work even with those settings (failed test020,
apparently
due to miscompiled double precision floating point constants).
--
Michel, do you think you can also arrange access to a Rawhide/x86_64 machine
for me
so that I can see what I can do about the dynamic linker issues there?
Original comment by aggraef@gmail.com
on 6 Sep 2009 at 3:25
Just to confirm that Pure works with LLVM trunk on ppc, too. I also ran a few
example
scripts from the distribution, and tested the batch compiler, all seems to be
in good
working order.
As for the dynamic linker issues on x86_64, I played around with the linker
options,
all to no avail. I can only conclude that the dynamic linker has some problem
there,
at least in conjunction with LLVM. As a workaround, I entered the strcmp()
function
into Pure's internal linker table so that it gets resolved correctly (that's in
r2226). With that it works fine now (tested with LLVM 2.5 as well as 2.7svn),
but we
should probably review this issue when FC12 has all the kinks ironed out. ;-)
Michel, I'm marking this bug as fixed now. Please let me know if you run into
any further problems, and thanks for helping to get these resolved!
Original comment by aggraef@gmail.com
on 7 Sep 2009 at 3:06
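The workaround described above (resolving strcmp through an internal table instead of the dynamic linker) can be illustrated with a small C sketch. This is not Pure's actual code from r2226, which is not shown in this thread. The table layout and the names generic_fn, sym_entry, pinned_syms and resolve_pinned are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type; cast back to the real type before calling. */
typedef void (*generic_fn)(void);

struct sym_entry {
    const char *name;
    generic_fn addr;
};

/* Hypothetical table of symbols pinned to the addresses chosen when this
   program was linked, bypassing run-time lookup for these names. */
static const struct sym_entry pinned_syms[] = {
    { "strcmp", (generic_fn)strcmp },
};

/* Return the pinned address for `name`, or NULL so that the caller can
   fall back to its normal dynamic lookup (e.g. dlsym). */
generic_fn resolve_pinned(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof pinned_syms / sizeof pinned_syms[0]; i++)
        if (strcmp(pinned_syms[i].name, name) == 0)
            return pinned_syms[i].addr;
    return NULL;
}
```

A resolver built this way would consult resolve_pinned first and only ask the dynamic linker when it returns NULL, so a misbehaving run-time lookup never sees the pinned names.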
Thank you for all the work on this, actually. Is 0.35 going to come out soon?
Otherwise I'll just incorporate the patch on top of 0.34.
Original comment by Michael....@gmail.com
on 7 Sep 2009 at 3:27
I still want to give Ryan a chance to check on OSX/ppc, just in case I have to
fix
some more things to get it working there. But if he doesn't respond, I'll
release
0.35 in a few days anyway.
Original comment by aggraef@gmail.com
on 7 Sep 2009 at 3:36
Original issue reported on code.google.com by Michael....@gmail.com
on 22 Aug 2009 at 5:10
Attachments: