GoogleCodeExporter closed this issue 9 years ago
I'm sorry about all the problems you've been having on solaris! We do have a
solaris box that we test on, and have successfully gotten everything to compile
(see the INSTALL file for more details), but we don't test on solaris
regularly, so there may have been some regressions.
} src/base/sysinfo.cc:135:14: error: ‘SYS_open’ was not declared in this
scope
I'm confused by this, because this code dates from early 2007, and I *know*
we've successfully built and run on solaris x86 since then. :-) The machine is
an x86_64 machine as well. Unfortunately, our solaris machine is down at the
moment, so I can't test why this successfully compiles for us. I noticed that
later in your report you have:
} Target: i386-pc-solaris2.11
Is this possibly an i386 vs x86_64 issue?
} ./configure --enable-frame-pointers
Yes, I'm not totally surprised that is required.
} libtool: link: link -dump -symbols .libs/profiler.o .libs/profile-handler.o
.libs/profiledata.o ./.libs/libstacktrace.a | | /usr/gnu/bin/sed 's/.* //' |
sort | uniq > .libs/libprofiler.exp
This is a libtool issue; nothing that is under our control. Almost certainly,
between those two pipe symbols is supposed to be a command. You can see by
looking at libtool line 956. libtool is produced automatically at
configure-time, and is different for every system, so I can't look at this
myself, but almost certainly the command is this one:
export_symbols_cmds="\$NM \$libobjs \$convenience | \$global_symbol_pipe |
\$SED 's/.* //' | sort | uniq > \$export_symbols"
In my libtool, I have
global_symbol_pipe="sed -n -e 's/^.*[ ]\\([ABCDGIRSTW][ABCDGIRSTW]*\\)[ ][ ]*\\([_A-Za-z][_A-Za-z0-9]*\\)\$/\\1 \\2 \\2/p'"
but you (presumably) have something different, probably the empty string. I'm
not sure why -- you may need to look into ltmain.sh to figure it out -- but
hopefully when you ran 'configure' it gave a big error message that may be a
clue.
Original comment by csilv...@gmail.com
on 30 Nov 2011 at 4:56
About syscall: no, it isn't an i386 vs x86_64 issue. That "i386 target" simply
means that the gcc executable is 32-bit, like many things in solaris; however,
it's fully able to compile 64-bit code without problems.
There simply is no such syscall:
$ grep SYS_open /usr/include/sys/syscall.h
#define SYS_openat 68
#define SYS_openat64 69
SYS_open isn't defined on modern solaris on x86_64 systems. I've read online
that by now it's defined only on sparc, though I can't check that.
The open() function on solaris uses the openat syscall with an AT_FDCWD
argument. Since perftools needs to make the syscall directly, some code like
this is needed:
#ifdef SYS_open
# define safeopen(filename, mode) syscall(SYS_open, filename, mode)
#else
# define safeopen(filename, mode) syscall(SYS_openat, AT_FDCWD, filename, mode)
#endif
instead of the old line. fcntl.h, which defines AT_FDCWD, is already included
at this point.
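The relationship between open() and openat() with AT_FDCWD can be checked with a small sketch like the one below. It is only an illustration, not perftools code: OpenatMatchesOpen is a hypothetical helper name, and it assumes a POSIX.1-2008 system that provides openat().

```cpp
#include <fcntl.h>     // open(), openat(), AT_FDCWD, O_RDONLY
#include <sys/stat.h>  // fstat(), struct stat
#include <unistd.h>    // close()

// Hypothetical helper: opens `path` once with open() and once with
// openat(AT_FDCWD, ...), then reports whether both descriptors refer to
// the same file (same device and inode numbers).
bool OpenatMatchesOpen(const char* path) {
  const int a = open(path, O_RDONLY);
  const int b = openat(AT_FDCWD, path, O_RDONLY);
  bool same = false;
  if (a >= 0 && b >= 0) {
    struct stat sa, sb;
    same = fstat(a, &sa) == 0 && fstat(b, &sb) == 0 &&
           sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
  }
  if (a >= 0) close(a);
  if (b >= 0) close(b);
  return same;
}
```

With AT_FDCWD, relative paths are resolved against the current working directory, which is why the two calls should behave the same for any path.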
libtool is old on solaris and causes various problems; that's a known fact :-/ I
had to work around its issues before, too. It's "ancient" by GNU standards:
$ libtool --version
ltmain.sh (GNU libtool) 1.5.22 (1.1220.2.365 2005/12/18 22:14:06)
I believe it has been patched to support solaris somehow, though, and works
better than if I just try to replace it with the latest GNU libtool.
global_symbol_pipe is empty in the generated libtool script here.
As for nm, it's a more complicated issue.
You see, solaris includes software and libraries compiled with both Sun Studio
and gcc. It also includes two sets of tools for analyzing binaries: native ones
(SGU) and GNU ones. For example, there is /usr/bin/nm (solaris nm) and
/usr/gnu/bin/nm (GNU binutils). GNU nm can't analyze some of the solaris
libraries and binaries that were compiled with Sun Studio, saying
"File format not recognized", and SGU nm can't analyze some gcc-generated
code (though sometimes they both work fine; I don't really know what's going
on here).
Solaris nm prints its output in a very different format from GNU nm, with
tables in pseudographics and such.
Now, returning to configure. It has code like:
...
if test "$lt_cv_path_NM" != "no"; then
NM="$lt_cv_path_NM"
else
# Didn't find any BSD compatible name lister, look for dumpbin.
if test -n "$ac_tool_prefix"; then
for ac_prog in "dumpbin -symbols" "link -dump -symbols"
do
...
If I run configure with the GNU environment in PATH, then nm is GNU nm and link
is the coreutils "ln" equivalent:
$ link --version
link (GNU coreutils) 8.5
$ nm --version
GNU nm (GNU Binutils) 2.19
This makes configure run like this:
checking for BSD- or MS-compatible name lister (nm)... no
checking for x86_64-pc-solaris2.11-dumpbin... no
checking for x86_64-pc-solaris2.11-link... no
checking for dumpbin... no
checking for link... link -dump -symbols
checking the name lister (link -dump -symbols) interface... BSD nm
...
checking for ranlib... ranlib
checking command to parse link -dump -symbols output from gcc object... failed
Here's what happens:
configure:6447: found /usr/gnu/bin/ranlib
configure:6458: result: ranlib
configure:6548: checking command to parse link -dump -symbols output from gcc
object
configure:6666: gcc -c -march=core2 -mtune=core2 -msse4.1 -m64 -O2 -pipe -g
-I/home/mosgalin/Software/include conftest.c >&5
configure:6669: $? = 0
configure:6673: link -dump -symbols conftest.o \| sed -n -e 's/^.*[
]\([BDRT][BDRT]*\)[ ][ ]*\([_A-Za-z][_A-Za-z0-9]*\)$/\1 \2 \2/p' \>
conftest.nm
link: invalid option -- 'd'
Try `link --help' for more information.
...
configure:6768: result: failed
It decides it doesn't like the GNU nm that's in PATH and goes for link, which is
GNU link. Or maybe it isn't, but the other two "link" binaries in solaris 11,
/usr/sbin/link and /usr/xpg4/bin/link, don't support arguments like "-dump"
either and are simplified ln clones, just like GNU link.
Now, recalling the original configure code, I can make it look for a different
nm, since it didn't like the GNU one and looks for a "BSD- or MS-compatible" one:
NM=/usr/bin/nm ./configure --host=x86_64-pc-solaris2.11 --enable-frame-pointers
This results in the following configure lines:
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm
checking the name lister (/usr/bin/nm) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 786240
...
checking for ranlib... ranlib
checking command to parse /usr/bin/nm output from gcc object... failed
config.log output:
...
configure:6548: checking command to parse /usr/bin/nm output from gcc object
configure:6666: gcc -c -march=core2 -mtune=core2 -msse4.1 -m64 -O2 -pipe -g
-I/home/mosgalin/Software/include conftest.c >&5
configure:6669: $? = 0
configure:6673: /usr/bin/nm conftest.o \| sed -n -e 's/^.*[
]\([BDRT][BDRT]*\)[ ][ ]*\([_A-Za-z][_A-Za-z0-9]*\)$/\1 \2 \2/p' \>
conftest.nm
configure:6676: $? = 0
cannot run sed -n -e 's/^.*[ ]\([BDRT][BDRT]*\)[ ][
]*\([_A-Za-z][_A-Za-z0-9]*\)$/\1 \2 \2/p'
I don't know why such a regexp is used on solaris. I'm really at a loss with
all this autoconf stuff. /usr/bin/nm output looks like:
conftest.o:
[Index] Value Size Type Bind Other Shndx Name
[4] | 0| 0|SECT |LOCL |0 |4 |
[7] | 0| 0|SECT |LOCL |0 |7 |
[9] | 0| 0|SECT |LOCL |0 |9 |
[10] | 0| 0|SECT |LOCL |0 |10 |
[11] | 0| 0|SECT |LOCL |0 |11 |
[2] | 0| 0|SECT |LOCL |0 |2 |
[3] | 0| 0|SECT |LOCL |0 |3 |
[8] | 0| 0|SECT |LOCL |0 |8 |
[5] | 0| 0|SECT |LOCL |0 |5 |
[6] | 0| 0|SECT |LOCL |0 |6 |
[1] | 0| 0|FILE |LOCL |0 |ABS |conftest.c
[13] | 16| 10|FUNC |GLOB |0 |2 |main
[12] | 0| 2|FUNC |GLOB |0 |2 |nm_test_func
[14] | 1| 1|OBJT |GLOB |0 |COMMON |nm_test_var
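One likely reason the parse step fails is that the sed pattern from config.log expects BSD-style "address type name" rows, which the table above does not contain. The sketch below transcribes that pattern into a C++ std::regex to illustrate the mismatch. It is an approximation under assumptions: MatchesBsdNm is a hypothetical name, and each "[ ]" in the log is assumed to have originally held a space and a tab.

```cpp
#include <regex>
#include <string>

// Returns true if `line` looks like a BSD-style nm row such as
// "0000000000000010 T main": some prefix, whitespace, a type letter from
// [BDRT], whitespace, and a C identifier at the end of the line.
bool MatchesBsdNm(const std::string& line) {
  // Transcription of the sed pattern from config.log into ECMAScript
  // syntax; "\t" here is a literal tab inside the bracket expression.
  static const std::regex kBsdNmLine(
      "^.*[ \t]([BDRT][BDRT]*)[ \t][ \t]*([_A-Za-z][_A-Za-z0-9]*)$");
  return std::regex_match(line, kBsdNmLine);
}
```

A GNU-style row such as "0000000000000010 T main" fits the pattern, while a solaris row such as "[13] | 16| 10|FUNC |GLOB |0 |2 |main" does not: there is no single-letter type column and the name follows a "|" rather than whitespace.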
I tried re-creating configure with the solaris version of GNU autoconf (2.63),
but it didn't change anything.
Original comment by Vladimir...@gmail.com
on 1 Dec 2011 at 3:40
} #ifdef SYS_open
} # define safeopen(filename, mode) syscall(SYS_open, filename, mode)
} #else
} # define safeopen(filename, mode) syscall(SYS_openat, AT_FDCWD, filename, mode)
} #endif
This makes sense to me as a fix. I'm just still confused about why it works on
my solaris box. I'd like to dig into this a bit deeper when my solaris box is
up and running again (hopefully in the next few weeks -- we're moving it to a
new colo so things are slower than I'd like). I'll keep this bug open in the
meantime.
As for configure, it looks like the script is doing this:
case `"$tmp_nm" -B /dev/null 2>&1 | sed '1q'` in
*/dev/null* | *'Invalid file or object type'*)
lt_cv_path_NM="$tmp_nm -B"
break
;;
*)
case `"$tmp_nm" -p /dev/null 2>&1 | sed '1q'` in
*/dev/null*)
lt_cv_path_NM="$tmp_nm -p"
break
;;
So you need to see if the nm on your path supports -B or -p. If it does, then
why isn't configure picking it up? You might try running sh -x configure to
get more insight into what's going on. Or config.log may have useful info.
It also looks like you can set NM manually if you have one you know is
GNU-style:
env NM=/whatever/nm ./configure
and see if that works.
Original comment by csilv...@gmail.com
on 1 Dec 2011 at 5:29
Thanks, I was able to get this fixed. The source of this problem was that
configure redefines the default PATH. So having GNU nm in PATH ahead of the
other nm didn't help, because configure found SGU nm (/usr/bin/nm) and used it
automatically. Specifying NM=/usr/gnu/bin/nm fixes it. I didn't think of that
before because, well, it was already the default nm in PATH.
Would it be possible to change configure.ac (or something else) so that
configure looks for GNU-style tools in /usr/gnu/bin on solaris before looking
in /usr/bin? In Solaris 11, the GNU utilities became part of the distribution
and are installed by default. Most go into /usr/bin, but the ones whose names
conflict with classic Solaris utilities get a "g" prefix there (gnm here, for
example), and the original name is installed into /usr/gnu/bin. It looks like
configure would have far fewer problems if it mostly used the GNU utilities.
Original comment by Vladimir...@gmail.com
on 2 Dec 2011 at 12:01
Ah, and I didn't see this problem because I don't have the solaris tools
installed.
I don't know solaris at all, but if you want to write up a patch to
configure.ac or whatever that does the right thing, I'm happy to take a look.
I take it that this doesn't resolve the issues you saw with SYS_open vs openat?
I will still plan to look at that when the solaris box is back online.
Original comment by csilv...@gmail.com
on 2 Dec 2011 at 7:18
Yes, SYS_openat is a must, and a change like the one above is required.
As for the nm invocation, it looks like it's a problem on my side. I used
./configure --host=x86_64-pc-solaris2.11, and that host argument caused
configure to look only for x86_64-pc-solaris2.11-nm, not for plain nm. For
every other utility it falls back to the plain name, but for some reason not
for nm. However, without the --host argument, /usr/gnu/bin/nm is correctly
detected and used, and everything works fine.
Strictly speaking, that --host argument is not needed for google-perftools; I'm
not cross-compiling or anything anyway. It's more of a workaround for building
64-bit software: in some software, the configure script reacts to the "x86_64"
part and changes settings accordingly when compiling a 64-bit version; without
it, some arguments might get messed up during linking or something (I forgot
what exactly, but I've seen this a few times). I always use it when compiling
64-bit software on solaris, but it's more of a habit, as not every application
needs it. With google-perftools, it's the first time this argument has caused
trouble, but I guess it's not a real problem, so everything can be left as is.
Original comment by Vladimir...@gmail.com
on 2 Dec 2011 at 8:42
OK, great! I'll just look into the SYS_open part then.
Original comment by csilv...@gmail.com
on 2 Dec 2011 at 9:12
Ugh, I made myself a solaris (well openindiana) system and totally forgot to
look at this before making a new release! I'll see what I can do for the next
perftools release.
Original comment by csilv...@gmail.com
on 23 Dec 2011 at 12:51
What I can say is that I compiled perftools successfully on my openindiana
instance before releasing. It looks like it got SYS_open from
/usr/include/sys/syscall.h, the beginning of which is attached below.
Original comment by csilv...@gmail.com
on 23 Dec 2011 at 1:05
Attachments:
Here is syscall.h from Solaris 11 (build 175), package
pkg:/system/header@0.5.11-0.175.0.0.0.2.1:
Original comment by Vladimir...@gmail.com
on 23 Dec 2011 at 10:30
Attachments:
Interesting -- they just leave out the number 5. What happens if you manually
add a
#define SYS_open 5
in the code? Does it happen to work, or just fall over?
Original comment by csilv...@gmail.com
on 23 Dec 2011 at 7:46
Well, it compiles with that define, but how do I check if it actually works?
I'm not sure I'm using the part of perftools that actually calls that code.
However, I noticed something strange: with the new perftools 1.9.1 plus that
define, when launching a program linked with tcmalloc I very often (but not
always!) get this error:
src/thread_cache.cc:90] uname failed assuming no TLS support (errno) 0
It never happened with 1.8.3.
Anyhow, returning to the SYS_open issue, here is a pretty detailed explanation
of this issue in Solaris 11:
http://dtrace.org/blogs/brendan/2011/11/09/solaris-11-dtrace-syscall-provider-changes/
According to it, the SYS_open syscall is completely gone and open() is emulated
with the openat() syscall in libc.
Original comment by Vladimir...@gmail.com
on 1 Jan 2012 at 1:11
Ah, interesting. I see that safeopen is used in only a very restricted way.
It would be pretty easy to work around this issue one way or another.
Replacing it with a normal open() call if SYS_open isn't defined would be fine.
The code is used all the time, though -- it's how we check if HEAPCHECK/etc are
set. If 'make check' passes, you're fine.
Original comment by csilv...@gmail.com
on 3 Jan 2012 at 5:58
make check mostly passes with such define (that line about no TLS support is
printed quite a few times, though).
There are some errors, like
--- Test failed for Allocate2: didn't account for 90% of executable memory
--- Program output:
src/thread_cache.cc:90] uname failed assuming no TLS support (errno) 0
Starting tracking the heap
google_malloc section is missing, thus InHookCaller is broken!
malloc_hook section is missing, thus InHookCaller is broken!
Hooked allocator frame not found, returning empty trace
Hooked allocator frame not found, returning empty trace
Hooked allocator frame not found, returning empty trace
(lots of lines like that)
or
Starting tracking the heap
google_malloc section is missing, thus InHookCaller is broken!
malloc_hook section is missing, thus InHookCaller is broken!
Hooked allocator frame not found, returning empty trace
Hooked allocator frame not found, returning empty trace
...
Hooked allocator frame not found, returning empty trace
Hooked allocator frame not found, returning empty trace
Check failed: regions_ != NULL:
Final result is
PASS
PASS: tcmalloc_and_profiler_unittest
======================================
6 of 43 tests failed
Please report to opensource@google.com
======================================
make[1]: *** [check-TESTS] Error 1
make[1]: Leaving directory `/home/mosgalin/Software/google-perftools-1.9.1'
make: *** [check-am] Error 2
I could attach the output if you are interested, but it's really huge, though
it compresses well.
For the record, perftools 1.8.3 without the "#define SYS_open 5" hack (with the
SYS_openat code used instead) has about the same errors, and the final output
is the same, "6 of 43 tests failed", but without the TLS errors during tests.
Original comment by Vladimir...@gmail.com
on 3 Jan 2012 at 9:28
Just to make sure where we are, I think the only outstanding issue is SYS_open
vs SYS_openat. I think the right fix will be to modify sysinfo.cc to just use
open() if SYS_open isn't defined. Not perfect but should be good enough. Feel
free to draw up a patch if you'd like -- it should be very straightforward.
Otherwise I will look at it when I'm back from vacation, though it may take a
while to get caught up.
You're right that many tests fail on solaris -- the heap-checker stuff, in
particular, is linux-only at the moment. So besides openat everything looks
good.
Original comment by csilv...@gmail.com
on 4 Jan 2012 at 10:20
Here is the patch:
diff -ur google-perftools-1.9.1/src/base/sysinfo.cc google-perftools-1.9.1-new/src/base/sysinfo.cc
--- google-perftools-1.9.1/src/base/sysinfo.cc 2011-07-13 04:27:08.000000000 +0400
+++ google-perftools-1.9.1-new/src/base/sysinfo.cc 2012-01-05 04:53:53.846899935 +0400
@@ -86,7 +86,12 @@
// time, so prefer making the syscalls directly if we can.
#ifdef HAVE_SYS_SYSCALL_H
# include <sys/syscall.h>
-# define safeopen(filename, mode) syscall(SYS_open, filename, mode)
+// Workaround for Solaris 11, where SYS_open syscall is not defined.
+# ifdef SYS_open
+# define safeopen(filename, mode) syscall(SYS_open, filename, mode)
+# else
+# define safeopen(filename, mode) open(filename, mode)
+# endif
# define saferead(fd, buffer, size) syscall(SYS_read, fd, buffer, size)
# define safeclose(fd) syscall(SYS_close, fd)
#else
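For illustration, the macros from the patch above could be exercised along these lines. This is a sketch rather than the actual perftools code path: ReadPrefix is a hypothetical helper, and it assumes a system where sys/syscall.h defines SYS_read and SYS_close.

```cpp
#include <cstddef>        // size_t
#include <fcntl.h>        // open(), O_RDONLY
#include <sys/syscall.h>  // SYS_open (if present), SYS_read, SYS_close
#include <unistd.h>       // syscall()

// Macros as in the patch above: raw syscalls where possible, plain open()
// where SYS_open is not defined (for example on Solaris 11).
#ifdef SYS_open
# define safeopen(filename, mode) syscall(SYS_open, filename, mode)
#else
# define safeopen(filename, mode) open(filename, mode)
#endif
#define saferead(fd, buffer, size) syscall(SYS_read, fd, buffer, size)
#define safeclose(fd) syscall(SYS_close, fd)

// Hypothetical helper: reads up to `size` bytes from the start of `path`.
// Returns the number of bytes read, or -1 if the file cannot be opened
// or read.
long ReadPrefix(const char* path, char* buffer, size_t size) {
  const int fd = static_cast<int>(safeopen(path, O_RDONLY));
  if (fd < 0) return -1;
  const long n = saferead(fd, buffer, size);
  safeclose(fd);
  return n;
}
```

Only the open step changes between platforms; the read and close calls still go through syscall() directly in both branches.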
While you say that it's the only issue, that error line that appears in 1.9.1
("src/thread_cache.cc:90] uname failed assuming no TLS support (errno) 0") for
tcmalloc-linked applications is quite annoying; can something be done about it?
I don't know; at the very least, maybe a left square bracket should be added :)
Have a good vacation!
Original comment by Vladimir...@gmail.com
on 5 Jan 2012 at 1:00
OK, I've started on a patch, similar to the one you have above. I'm just
getting it reviewed locally and then will put it out on SVN.
I also figured out the problem with the TLS error (solaris can return
positive numbers from a uname() call, rather than just 0 like linux does), and
have a fix out for that as well. If you want to quiet it in the meantime,
change the uname() call in thread_cache.cc to this:
if (uname(&buf) == -1) { // should be impossible
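As a standalone sketch of the same check (UnameSucceeded is a hypothetical name, not a perftools function): POSIX only promises a non-negative return value from uname() on success and -1 on failure, so the test below compares against -1 instead of 0.

```cpp
#include <sys/utsname.h>  // uname(), struct utsname

// Returns true if uname() filled in `buf`. Success is any value other
// than -1; a positive return (as seen on solaris) is still success.
bool UnameSucceeded(struct utsname* buf) {
  return uname(buf) != -1;
}
```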
Original comment by csilv...@gmail.com
on 13 Jan 2012 at 6:31
This should be fixed in perftools 1.10, just released.
Original comment by csilv...@gmail.com
on 31 Jan 2012 at 7:18
Yes, confirmed, both issues are fixed. Thanks!
Original comment by Vladimir...@gmail.com
on 3 Feb 2012 at 10:27
Original issue reported on code.google.com by
Vladimir...@gmail.com
on 30 Nov 2011 at 2:04