hughperman / pure-lang

Automatically exported from code.google.com/p/pure-lang

shared pure segfaults on startup (0.31 on FreeBSD 7.2 i386 with LLVM 2.5) #14

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. LDFLAGS=-L/usr/local/lib ./configure --with-libiconv-prefix=/usr/local --enable-debug --prefix=/opt
2. gmake all
3. LD_LIBRARY_PATH=. PURELIB=./lib ./pure

What is the expected output?

Pure 0.31 (i386-unknown-freebsd7.2) ...

What do you see instead?

Assertion failed: (errorcode == 0), function Mutex, file Mutex.cpp, line 85. 

What version of the product are you using? On what operating system?

I see this with 0.27 and 0.31 (on FreeBSD RELENG_7_2); didn't test other
versions (or trunk).

Please provide any additional information below.

see attached output of a gdb session (r, bt, bt full)

static pure works as advertised

Original issue reported on code.google.com by neuhau...@sigpipe.cz on 26 Aug 2009 at 4:35

Attachments:

GoogleCodeExporter commented 8 years ago
Yes, this looks familiar. It hits a failed assertion in LLVM, so this is most
likely an LLVM issue. Unfortunately, I don't have FreeBSD installed, so I can't
debug the issue myself. Any further input that helps to resolve this issue will
be *much* appreciated.

Interestingly, NetBSD doesn't exhibit that behaviour; LLVM and Pure work fine
there.

For the time being, you might be able to work around this by building a
statically linked version of the interpreter (configure --disable-shared). Does
that work for you?

Which LLVM version do you use? Have you tried the 2.6 release branch or current
trunk of LLVM? (For the latter you need Pure 0.32, due to be released today.)

Also, what's the gcc version? (There are some gcc versions which are known not
to work with LLVM.)

Original comment by aggraef@gmail.com on 27 Aug 2009 at 6:24

GoogleCodeExporter commented 8 years ago
compiler is the system one:

g++ (GCC) 4.2.1 20070719  [FreeBSD]

the static build does work ok in both 0.27 and 0.31, sorry for not stressing
that enough.

LLVM is 2.5 (built from ports, FreeBSD equivalent of a "source package"),
haven't tried other versions yet.

i'll post further info (config.log, etc) later today.

Original comment by neuhau...@sigpipe.cz on 27 Aug 2009 at 7:22

GoogleCodeExporter commented 8 years ago
Oops, you already mentioned that it's LLVM 2.5. So it might be worth looking at
the LLVM 2.6 release branch and trunk in svn to see whether one of these fixes
the problem. Instructions for getting LLVM from svn can be found here:
http://llvm.org/docs/GettingStarted.html#checkout

Original comment by aggraef@gmail.com on 27 Aug 2009 at 7:22

GoogleCodeExporter commented 8 years ago
gcc 4.2.1 should be ok with LLVM.

Original comment by aggraef@gmail.com on 27 Aug 2009 at 7:32

GoogleCodeExporter commented 8 years ago
The port Makefile looks good, too. So it's not an issue with the build options
AFAICS.

Original comment by aggraef@gmail.com on 27 Aug 2009 at 7:37

GoogleCodeExporter commented 8 years ago
the Mutex.cpp assert disappears when I run ./configure with LIBS=-lpthread

about half of the tests still dump core, though.  all with:

While deleting: ...
An asserting value handle still pointed to this value!

I'm still poking at the second problem...

Original comment by neuhau...@sigpipe.cz on 27 Aug 2009 at 11:52

GoogleCodeExporter commented 8 years ago
0.31 + 2.5 works:

LIBS=-lpthread LDFLAGS=-L/usr/local/lib ./configure --prefix=/opt --with-libiconv-prefix=/usr/local --enable-debug && gmake all check

ends in a series of "passed" (save for the expected failure of 020)

2.6.r71086 bombs, as I wrote in #6, will try later revisions tonight.

configure should discover the need to use -lpthread; I'll try to produce
a patch to that effect, unless you beat me to it (which I'd really welcome,
haven't touched autoconf or m4 in a few years).
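
One way to confirm that a build actually picked up the thread library is to
look at the dynamic dependencies of the interpreter. The sketch below is only an
illustration and not part of the Pure build system: `links_threads` is a
hypothetical helper, and the library names it looks for (libthr or libpthread)
are assumptions about what the platform's `ldd` prints.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and succeeds (exit 0)
# if a threading library appears among the listed dependencies.
# The names libthr and libpthread are assumptions; adjust for your system.
links_threads() {
    grep -Eq 'lib(thr|pthread)\.so'
}

# Example (run from the build directory, as in the original report):
#   LD_LIBRARY_PATH=. ldd ./pure | links_threads \
#       || echo "thread library missing; try configuring with LIBS=-lpthread"
```

If the check fails for the shared build, rerunning configure with
LIBS=-lpthread as shown above is the workaround reported in this thread.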

Original comment by neuhau...@sigpipe.cz on 27 Aug 2009 at 2:49

GoogleCodeExporter commented 8 years ago
Ok, I fixed up the linker options in r2131. Attached is an updated tarball with
the current svn sources. Could you please try again with this version?

Also, can you please post an execution log with those "While deleting:"
messages so that I can have a look at them?

Original comment by aggraef@gmail.com on 27 Aug 2009 at 2:51

Attachments:

GoogleCodeExporter commented 8 years ago
re "unless you beat me to it": looks like you did in r2131, thanks!

Original comment by neuhau...@sigpipe.cz on 27 Aug 2009 at 2:51

GoogleCodeExporter commented 8 years ago
This is very good news! :) Eddie Rucker and I worked on that issue a bit some
time ago, but I would never have expected it to be a linker issue. Something is
very broken with this gcc version if it generates dysfunctional code instead of
reporting a linkage error in such a case.

Anyway, I'm happy that it works now (at least with LLVM 2.5).

About the LLVM 2.6 breakage: Could that be related to issue #9? As you're
running on a 32 bit system, you should try configuring LLVM >= 2.6 with
--disable-pic, as described in Pure's INSTALL file. (--enable-pic is the default
since LLVM 2.6, which is unfortunate because AFAICT
http://llvm.org/bugs/show_bug.cgi?id=3239 is still unfixed, which causes the
LLVM JIT to generate wrong PIC code on x86-32 systems.)

Talking about --enable/disable-pic, in the LLVM 2.5 FreeBSD port I find
neither. This probably means that the port won't work on x86-64. Maybe you could
resolve that issue with the maintainer of the port? Here are the LLVM configure
options that must be used on x86-32/64 systems, respectively:

x86-32: --disable-pic (default with LLVM 2.5)
x86-64: --enable-pic (default with LLVM >= 2.6)

Once http://llvm.org/bugs/show_bug.cgi?id=3239 is resolved, --disable-pic
should work on either system, at least that's what the LLVM developers told me.
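
The x86-32/x86-64 mapping above can be sketched as a small shell helper.
`llvm_pic_flag` is a hypothetical name, and the machine strings it matches
(i386 through i686, amd64, x86_64, as typically printed by `uname -m`) are
assumptions; other architectures are deliberately left unhandled.

```shell
#!/bin/sh
# Hypothetical helper: print the LLVM configure PIC option for a machine
# type, following the x86-32 / x86-64 recommendation in this comment.
llvm_pic_flag() {
    case "$1" in
        i?86)          echo "--disable-pic" ;;   # 32 bit x86
        amd64|x86_64)  echo "--enable-pic" ;;    # 64 bit x86
        *)             echo "unsupported machine: $1" >&2
                       return 1 ;;
    esac
}

# Example (in the LLVM source tree):
#   ./configure $(llvm_pic_flag "$(uname -m)")
```

Passing the flag explicitly avoids depending on the default, which differs
between LLVM 2.5 and LLVM >= 2.6.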

Original comment by aggraef@gmail.com on 27 Aug 2009 at 4:22

GoogleCodeExporter commented 8 years ago
As luck has it, I can reproduce the "While deleting:" failed assertions on my
32 bit Linux system with LLVM 2.5 now (funnily enough, 'make check' works ok
there, but compiling pure-gen hits it). Will fix asap.

Original comment by aggraef@gmail.com on 27 Aug 2009 at 6:24

GoogleCodeExporter commented 8 years ago
Well, turns out that those failed assertions I saw were in the batch compiler,
so they're probably not related to what you see using LLVM 2.6. Anyway, I fixed
those now (svn r2136).

Do you still have those failed assertions on LLVM 2.6? If so, can you please
post a backtrace?

Original comment by aggraef@gmail.com on 28 Aug 2009 at 4:02

GoogleCodeExporter commented 8 years ago
I've just uploaded Pure 0.33, which has all the latest fixes. Please give that
version a try, thanks.

http://pure-lang.googlecode.com/files/pure-0.33.tar.gz

Original comment by aggraef@gmail.com on 28 Aug 2009 at 1:39

GoogleCodeExporter commented 8 years ago
i haven't gotten around to rebuilding llvm-2.6.r71086 without PIC yet, but 0.33
passes many more tests with the same llvm-2.6.r71086 build i reported before
(test020 failure is expected):

roman@sachmet ~/install/pure-0.33 1003:0 > gmake check
Running tests.
prelude.pure: passed
test001.pure: passed
test002.pure: passed
test003.pure: passed
test004.pure: passed
test005.pure: passed
test006.pure: passed
test007.pure: passed
test008.pure: passed
test009.pure: passed
test010.pure: passed
test011.pure: passed
test012.pure: passed
test013.pure: passed
test014.pure: passed
test015.pure: Abort trap (core dumped)
FAILED
test016.pure: passed
test017.pure: passed
test018.pure: passed
test019.pure: passed
test020.pure: FAILED
test021.pure: passed
test022.pure: passed
test023.pure: passed
test024.pure: passed
test025.pure: Abort trap (core dumped)
FAILED
test026.pure: passed
test027.pure: passed
test028.pure: Abort trap (core dumped)
FAILED
test029.pure: passed
test030.pure: passed
test031.pure: Abort trap (core dumped)
FAILED
test032.pure: passed
test033.pure: passed
test034.pure: passed
test035.pure: passed
test036.pure: Abort trap (core dumped)
FAILED
test037.pure: passed
test038.pure: passed
test039.pure: passed
test040.pure: passed
test041.pure: Abort trap (core dumped)
FAILED
test042.pure: passed
gmake: *** [check] Error 1
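
To compare runs like this one without reading the whole log, the failing test
names can be extracted with a short filter. This is a sketch that assumes the
log format shown above, where FAILED appears either on the test's own line or
on the line after an abort message; `failed_tests` is a hypothetical helper
name.

```shell
#!/bin/sh
# Hypothetical helper: read `gmake check` output on stdin and print the
# name of every test that is reported as FAILED.
failed_tests() {
    awk '
        # Remember the most recent test name, e.g. "test015.pure".
        /^[A-Za-z0-9_]+\.pure:/ { name = $1; sub(/:$/, "", name) }
        # FAILED may be on the same line or on the following line.
        /FAILED/                { print name }
    '
}

# Example:
#   gmake check 2>&1 | failed_tests
```

Saving the output of two runs and diffing them shows quickly whether a
different LLVM build changes the set of failures.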

Original comment by neuhau...@sigpipe.cz on 29 Aug 2009 at 7:17

Attachments:

GoogleCodeExporter commented 8 years ago
r2155 with llvm-2.7.r80431, same set of failed tests (diffs attached):

Running tests.
prelude.pure: passed
test001.pure: passed
test002.pure: passed
test003.pure: passed
test004.pure: passed
test005.pure: passed
test006.pure: passed
test007.pure: passed
test008.pure: passed
test009.pure: passed
test010.pure: passed
test011.pure: passed
test012.pure: passed
test013.pure: passed
test014.pure: passed
test015.pure: Abort trap (core dumped)
FAILED
test016.pure: passed
test017.pure: passed
test018.pure: passed
test019.pure: passed
test020.pure: FAILED
test021.pure: passed
test022.pure: passed
test023.pure: passed
test024.pure: passed
test025.pure: Abort trap (core dumped)
FAILED
test026.pure: passed
test027.pure: passed
test028.pure: Abort trap (core dumped)
FAILED
test029.pure: passed
test030.pure: passed
test031.pure: Abort trap (core dumped)
FAILED
test032.pure: passed
test033.pure: passed
test034.pure: passed
test035.pure: passed
test036.pure: Abort trap (core dumped)
FAILED
test037.pure: passed
test038.pure: passed
test039.pure: passed
test040.pure: passed
test041.pure: Abort trap (core dumped)
FAILED
test042.pure: passed
gmake: *** [check] Error 1

Original comment by neuhau...@sigpipe.cz on 30 Aug 2009 at 10:38

Attachments:

GoogleCodeExporter commented 8 years ago
these segfaults affect both dynamic and static build

Original comment by neuhau...@sigpipe.cz on 30 Aug 2009 at 11:47

GoogleCodeExporter commented 8 years ago
Yes, they're not actually segfaults, but some failed assertions in LLVM. But I
just can't reproduce these here. Which options did you configure LLVM with?

Original comment by aggraef@gmail.com on 30 Aug 2009 at 9:33

GoogleCodeExporter commented 8 years ago
I'm using the devel/llvm-devel port (with patches other than
files/patch-tools_clang_lib_Headers_Makefile and
files/patch-tools_clang_utils_scan-build removed; they were incorporated
upstream). the port defines just --enable-optimized, and I *might* have added
--disable-pic for the build. OTOH, I just realized I had leftovers from a
previous 2.6 install in the same location, which could break stuff as well.

i'll chime in here as soon as i have anything more conclusive.

Original comment by neuhau...@sigpipe.cz on 31 Aug 2009 at 10:54

GoogleCodeExporter commented 8 years ago
I think that --enable-expensive-checks (which is enabled by default) might be
the issue. Since I never have this enabled, this would explain why I don't see
these failed assertions. Will try that later today.

NB: --enable-optimized alone is not good enough for a production version of
LLVM. I recommend configuring LLVM with --enable-optimized --disable-assertions
--disable-expensive-checks (unless you need to debug LLVM itself), otherwise
generating LLVM IR will be *very* slow.
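
As a minimal sketch of that recommendation, the flags can be kept in one place
and the resulting command printed for review before it is run. The helper name
`llvm_configure_cmd` is hypothetical; only the three flags themselves come from
this comment.

```shell
#!/bin/sh
# Flags quoted from the recommendation above.
LLVM_RELEASE_FLAGS="--enable-optimized --disable-assertions --disable-expensive-checks"

# Hypothetical helper: print the configure command instead of running it,
# so the flags can be checked first. Extra arguments are appended unchanged.
llvm_configure_cmd() {
    echo "./configure $LLVM_RELEASE_FLAGS $*" | sed 's/ *$//'
}

# Example (in the LLVM source tree):
#   llvm_configure_cmd --prefix=/opt          # review the command
#   eval "$(llvm_configure_cmd --prefix=/opt)" # then run it
```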

Original comment by aggraef@gmail.com on 31 Aug 2009 at 12:32

GoogleCodeExporter commented 8 years ago
Yes, --enable-expensive-checks was the culprit. The failed assertions seem to be
bogus, but anyway I worked around that in r2162. Can you please verify?

(I still recommend that LLVM should be configured with --enable-optimized
--disable-assertions --disable-expensive-checks to improve performance.)

Original comment by aggraef@gmail.com on 31 Aug 2009 at 9:52

GoogleCodeExporter commented 8 years ago
Ok, to the best of my knowledge the remaining issues should be fixed in r2162,
so I'm setting the status of this bug report to "Fixed" now. Please let me know
if it works for you.

Original comment by aggraef@gmail.com on 1 Sep 2009 at 6:08

GoogleCodeExporter commented 8 years ago
yes, this works, thanks!

Original comment by neuhau...@sigpipe.cz on 1 Sep 2009 at 6:58

GoogleCodeExporter commented 8 years ago
Great, thanks for helping to get these issues sorted out!

Original comment by aggraef@gmail.com on 1 Sep 2009 at 7:18

GoogleCodeExporter commented 8 years ago
Roman, please note that I had to back out r2162 again, as it caused some serious
leaks in the JIT which would eventually cause the JIT to abort when running out
of memory for function stubs (which could happen, e.g., when repeatedly calling
'eval' on expressions involving local functions). So the code I disabled in
r2162 (in order to work around the issues with LLVM's --enable-expensive-checks)
*is* needed after all, to prevent more serious misbehaviour.

The only thing I can recommend right now to deal with this issue is to just
build LLVM with --disable-expensive-checks. --enable-expensive-checks is the
real culprit here, and doesn't buy you anything (just slows down the compiler)
unless you really need to debug LLVM.

Original comment by aggraef@gmail.com on 6 Oct 2009 at 12:33