Closed GoogleCodeExporter closed 8 years ago
argh, strlen-2 testacse is actually built from three files ... the following
stems
from a permutation of optimization switches on all three source files:
1. it does not really matter, what optimization switches are used for lib/main.c
2. if strlen-2.c is compiled with -O0 and strlen-2-lib.c with -O1, the thing
aborts.
3. if both strlen-2.c and strlen-2-lib.c are compiled with -O0, the thing
segfaults.
4. if strlen-2.c is compiled with -O1 and strlen-2-lib.c is compiled with -O0,
the
thing runs ok.
Original comment by jmoc...@gmail.com
on 20 Oct 2008 at 1:01
case (4) is OK and even the abort in (2) is actually OK (strlen-2.c calls
non-built-in strlen() which should abort in this case.
however, (3) is just plain strange. I walked the main_test() func with dbx on
SX8 and
it turns out that segfault is done in strlen(), where we try to load something
in
memory just a few steps in the func:
ld1b $s35,0(0,$s35)
where $s35 = 0x000000007d1b9b40
this is a pointless address as our memory layout is afaik:
text/data >= 0x0000000400000000
stack >= 0x0000000800000000
(NOTE: in -O0 case, this is the strlen() we provide in strlen-2-lib.c)
the strange thing is, this is the first (of two?!) parameters to our strlen()
call;
dbx shows the following stack trace:
(dbx) where
* 0 strlen(0x7d1b9b40, 0x86400000) at 0x400001400
1 main_test() at 0x400000cf4
2 main(0x1, 0x8000000150, 0x8000000160) at 0x4000016bc
3 _start(0x1, 0x8000000150, 0x8000000160) at 0x4000005a0
now, I notice that main() has three params as well and it seems that the first
two
are "real" params, argc == 0x1 after all.
anyhow, the value of the first strlen() param is strange.
I'm especially puzzled by the following (based on assembly source and dbg
tracing):
1. prior to strlen() call, we put the only argument in memory twice:
first time via $s1 (hard frame ptr):
sts $s36,320(,$s1)
this one stores at 0x8...1f70
and immediately afterwards once again via $s2 (stack ptr):
lds $s37,320(,$s1)
sts $s37,-296(,$s2)
this one stores at 0x8...1f68
the destination addresses for both sts instructions are exactly 8 bytes apart.
then, we load strlen() address to $s35.
now, we put 0x2 in $s36 and store it via stack ptr:
sts $36,-336(,$s2)
the destination address is 0x8...1f40
this address is lea-ed into $s34:
lea $s34,-336(,$s2)
NOTE: here lies, imho, the root of all problems. $s34 should be arg ptr, right?
well,
args begin at 08x...1f70 (or 1f68, I don't know why it's put on stack twice ...)
finally, we copy $s35 to $s33 and bsic to strlen():
bsic $s32,($s33)
in strlen(), 8 regs are stored from 0x8...1f80 onwards. $s1 is set to point to
this
address. $s2 points to 0x8...21c0.
after initial check (should we abort()), strlen() goes on to
lds $s35,280(,$s1)
which loads from 0x8...2034 (pointless, surely our params are not there, and
indeed,
$s35 contains 0 after this)
then we seem to want to pop the argument
lds $s36,8(0,$s3)
but since arg ptr is WRONG (look above), this fetches thing from 0x8...1f48 and
there
we have 0x000000007d1b9b40 in $s36, which is clearly an invalid address (also
the
string we passed as param is at 0x4...a90)
so the code created by gcc to call strlen() is WRONG. erich?
Original comment by jmoc...@gmail.com
on 20 Oct 2008 at 4:14
First of all, this is the second if block in the main_test. The first one is
optimized away. You can see this with -fdump-tree-all.
From the end of your report: lea $s34,-336(,$s2) seems ok, that is the point
where
the arguments (first the number of arguments) are supposed to be. The prologue
code
confirms that $s2 is set right. The number of arguments seems wrong. When
compiling
with -dP in the assembler you see that the number of bytes passed are 16 (in the
const_int), so converting that to 2 arguments is right. But 16 bytes is wrong
in the
first place. I tried with only calling strlen() and passing it a char*, the
number of
arguments was set to 2, again(!). Maybe that is one of the first places to look
for.
The reason why the pointer is placed in a completely wrong place, I don't know.
One
could search for reasons why the wrong args are passed (some macros in sx.h
decide
about how and how many arguments are passed). Also it is possible that the
call_value
insn or expansion are somehow broken or too limited. Look into contrib/b for a
completely different approach (that one lead to more errors).
Original comment by efo...@gmail.com
on 20 Oct 2008 at 10:58
argh. still no luck. however, it seems the problem is not related to overriding
builtin strlen (just tried it with mystrlen() in its stead). so it must be in
the
complex expression passed as the parameter to strlen (and/or faulty
optimization of
always-false-conditions). if not, then it must be an act of god.
Original comment by jmoc...@gmail.com
on 21 Oct 2008 at 11:32
btw, it's the third if in strlen-2.c; both first and second are optimized away
even
with -O0, which is just plain strange to me - how dares gcc speculate about the
return value of strlen(), which it needs to if it wants to optimize those two
ifs away?
Original comment by jmoc...@gmail.com
on 21 Oct 2008 at 11:51
ok. I think I got to the minimal testcase. the problem seems to be triggered by
using
the ternary if-then-else operator ?: in place of a function argument. compile
the
attached testcase to assembly with -O0, and inspect it: arg count (1) for
blah() call
is stored -304 off SP ($s2), while the only arg is stored at -280 off SP.
there's
"whatever's in memory" in between and blah() in turn refers to that (8 off the
arg
ptr ($s3), which is -304 off old SP.
the other conditionally compiled branch (not enabled by default) of the testcase
compiles OK, as the argument is calculated separately using ?: and stored in a
temp var.
Original comment by jmoc...@gmail.com
on 21 Oct 2008 at 2:12
Attachments:
Very interesting! This is exactly the case with the still failing cases in bug
#34.
Probably will set that bug as duplicate now...
Original comment by efo...@gmail.com
on 21 Oct 2008 at 2:27
this is a rather wild speculation, but as we consistently put the argument on
the
stack 16 bytes off, I'd say that we somehow think that the ?: construct takes 24
bytes (3x 8bytes = 1x size_t + 2x ptr). perhaps a type-related problem?
Original comment by jmoc...@gmail.com
on 21 Oct 2008 at 2:40
This happens also if you call a function with integer arguments and have inside
the
arguments the ?: . Look at the -fdump-tree-all (last printed tree dump) and you
will
see that the internal code generated looks very much like the thing that's
working
for you. I.e. there are intermediate variables.
Can you check what you really have in the missing place? maybe the intermediate
variables are placed there? From -fdump-rtl-all I can't say things look wrong.
The
arguments are virtual-outgoing-args until the step 110r.vregs (instantiate
virtual
registers), and their datatypes look reasonably (DI). This does not look as if
the
data stored as argument is wrong. It rather looks like the
virtual-outgoing-args are
pointing to the wrong place, as if someone else has lifted the arg pointer or
allocated a stack slot (or two) in the wrong place.
Original comment by efo...@gmail.com
on 21 Oct 2008 at 3:01
there is *nothing* in the missing place (between arg at wrong address and arg
count).
if you read the assembler, you can see that nothing is ever put there by the
caller.
callee accesses this place thinking that the arg is there (it should be, but
isn't).
Original comment by jmoc...@gmail.com
on 21 Oct 2008 at 3:17
Fancy... in the code I produce I see that the intermediate result of the ?: is
stored
at 288(,$s1). This is two slots above 272(,$s1). As maximum number of arguments
is 1
(blah()), and one slot is reserved for the number of arguments, the place where
the
?: result is stored is the first available slot for automatic variables. So
this part
has worked correctly.
On the other hand, when calling the function blah() arguments are stored
relative to
the stack pointer $s2. The only argument is stored at -280(,$s2) which is
actually
equal to 296(,$s1). The correct place would be 280(,$s1)...
The decision for the place to store the arguments is done very early (.vreg
step),
while the place for storing the local variable is really there only after .greg.
Maybe this means we need to delay storing the arguments until the other
registers are
done? In other implementations there are several places where such delays need
to be
done...
Original comment by efo...@gmail.com
on 21 Oct 2008 at 3:39
grrrrrr ... I'm getting really annoyed by this. a fix, a fix, a kingdom for a
fix!
the main difference between -O0 and -O1 (latter works OK) is in the fact that
-O1
code does not put intermediate value resulting from ?: operator (iftmp0 in rtl)
on
the stack.
-O0 (fails miserably):
1. ?:
2. sts result,288(,$s1) (one or the other result)
3. lds reg,288(,$s1) (reloads it result of ?:)
4. sts reg,-280(,$s2)
5. sts argcount,-304(,$s2)
6. lea argptr,-304(,$s2)
7. bsic
-O1 (works like a charm):
1. ?: (result is in reg)
2. sts reg,-280(,$s2)
3. sts argcount,-288(,$s2)
4. lea argptr,-288(,$s2)
5. bsic
thus, it seems that it's the placing of tmp result of expression used for arg
on the
stack that confuses gcc's idea of what is where on the stack.
btw, -fverbose-asm is your friend ... ;)
Original comment by jmoc...@gmail.com
on 22 Oct 2008 at 2:33
one more thing: I was wrong about nothing being put between arg count and the
only
arg (see comment 10); actually, the stack looks like this with -O0 just before
doing
the bsic to blah():
SP ($s2) -------------> callee RSA (272b)
...
SP - 280 -------------> arg (8b, == iftmp0 == result of ?!)
SP - 288 -------------> iftmp0 (8b, result of ?:)
SP - 296 -------------> ???? (8b, nothing stored here)
ARGPTR = SP - 304 ----> arg count (8b, == 1 in our case)
caller RSA (272b)
...
1. argcount and argptr are assigned just after caller RSA at SP-304: correct, as
outgoing args should be just after caller RSA, before the arguments
2. iftmp0, a (auto) local, comes after args space at SP-288: correct.
3. the only arg should be put at SP-296 (empty slot) after argcount in the args
space, however, gcc somehow decides that it should go at SP-280.
4. there is also 8 bytes too much on the stack, the only (auto) local is
iftmp0, we
have 16 bytes allocated for locals.
looking at rtls is beginning to give me a slight headache. I expect it to
progress as
the evening draws nearer ... ;)
Original comment by jmoc...@gmail.com
on 22 Oct 2008 at 3:46
moving args after locals helps: now the thing works with all optimization
levels,
including optimizing for size.
however, we still leave one slot unused, according to my reverse engineering of
stack
from assembly for s.c:
SP ($s2) -->
RSA (callee, 272b)
SP - 280 -->
arg (8b)
SP - 288 -->
argcount (8b)
???????? (8b, wtf does this come from?!)
SP - 304 -->
local0 (8b, iftmp0)
RSA (current, 272b)
HFP ($s1) ->
being a pedantic sob, I want this empty slot eliminated: the fight is not over
yet! ;)
Original comment by jmoc...@gmail.com
on 22 Oct 2008 at 6:05
empty space is due to making frame size a multiple of 16 bytes (see
sx_compute_frame_size()).
Original comment by jmoc...@gmail.com
on 23 Oct 2008 at 8:25
Original issue reported on code.google.com by
jmoc...@gmail.com
on 20 Oct 2008 at 12:52