Closed GoogleCodeExporter closed 9 years ago
If i put peer with option "htcp-no-clr" it won't crash.
Original comment by nuclear...@gmail.com
on 4 Sep 2009 at 12:20
ok. how have you configured the two squid/lusca installs? I haven't used HTCP
myself so I don't quite know what
is going on there.
Original comment by adrian.c...@gmail.com
on 11 Sep 2009 at 9:04
I've got HTCP finally going between some test proxies but I can't reproduce
that exact bug. Can you please
explain how your caches are setup and how to reproduce the issue?
Ta!
Original comment by adrian.c...@gmail.com
on 8 Feb 2010 at 7:16
Right. I've created a basic setup to test this stuff and it invalidates stuff
via HTCP correctly. It however doesn't
crash for me. I think I need some further information!
Original comment by adrian.c...@gmail.com
on 8 Feb 2010 at 7:36
I will try to run it today and will post configs and etc.
Original comment by nuclear...@gmail.com
on 8 Feb 2010 at 9:03
Please do. I'm trying to finish off all of the random bugs in the issue list
before I start merging in more work.
Original comment by adrian.c...@gmail.com
on 8 Feb 2010 at 9:06
Cache is enough large "Validated 2310439 Entries"
Two hosts. 10.0.1.34 and 10.0.1.35
Key strings to reproduce:
htcp_port 4827
...
cache_peer 10.0.1.35 sibling 8080 4827 htcp proxy-only htcp-no-clr
and other one 10.0.1.34
If i remove htcp-no-clr on 10.0.1.34 , and during rebuild there is incoming http
requests - 10.0.1.34 will core dump.
It is not easy to reproduce issue, and for example if i enabled debug_options
ALL,9
it will not crash. I have also very high incoming request rate and amount of
established connections (up to 8k established), not sure it can be reproduced
in lab.
Something i was able to catch:
2010/02/08 23:37:26| htcpBuildSpecifier: size 421
2010/02/08 23:37:26| htcpBuildData: hdr.length = 429
2010/02/08 23:37:26| htcpBuildData: size 429
2010/02/08 23:37:26| htcpBuildPacket: size 435
2010/02/08 23:37:26| htcpSend: 10.0.1.39/4827
2010/02/08 23:37:26| htcpQuery: key (0x816fdf0)
04E9AF5CC28042EF4D48B9F83519936A
2010/02/08 23:37:26| htcpRecv: FD 23, 20 bytes from 10.0.1.40:4827
2010/02/08 23:37:26| htcpHandle: htcpHdr.length = 20
2010/02/08 23:37:26| htcpHandle: htcpHdr.major = 0
2010/02/08 23:37:26| htcpHandle: htcpHdr.minor = 1
2010/02/08 23:37:26| htcpHandle: hsz = 16
2010/02/08 23:37:26| htcpHandle: length = 14
2010/02/08 23:37:26| htcpHandle: opcode = 1 HTCP_TST
2010/02/08 23:37:26| htcpHandle: response = 1
2010/02/08 23:37:26| htcpHandle: F1 = 0
2010/02/08 23:37:26| htcpHandle: RR = 1
2010/02/08 23:37:26| htcpHandle: msg_id = 68
2010/02/08 23:37:26| htcpHandle: hsz = 6
2010/02/08 23:37:26| htcpHandleTst: sz = 6
2010/02/08 23:37:26| htcpHandleTstResponse: msg_id = 68
2010/02/08 23:37:26| htcpHandleTstResponse: HIT
2010/02/08 23:37:26| htcpUnpackDetail: 0 bytes left
2010/02/08 23:37:26| htcpHandleTstResponse: key (0x81691c0)
CA1EDED1C6538099CD07E2C7607ABDA4
2010/02/08 23:37:26| ctx: enter level 0:
'http://sup.live.com/whatsnew/whatsnewservice.asmx'
2010/02/08 23:37:26| htcpBuildOpData: opcode HTCP_CLR
2010/02/08 23:37:26| htcpBuildClrOpData: RR_REQUEST
2010/02/08 23:37:26| htcpBuildCountstr: LENGTH = 3
2010/02/08 23:37:26| htcpBuildCountstr: TEXT = {GET}
2010/02/08 23:37:26| htcpBuildCountstr: LENGTH = 49
2010/02/08 23:37:26| htcpBuildCountstr: TEXT =
{http://sup.live.com/whatsnew/whatsnewservice.asmx}
2010/02/08 23:37:26| htcpBuildCountstr: LENGTH = 3
2010/02/08 23:37:26| htcpBuildCountstr: TEXT = {1/1}
FATAL: Received Segment Violation...dying.
CPU Usage: 1.813 seconds = 0.768 user + 1.045 sys
Maximum Resident Size: 459024 KB
Page faults with physical i/o: 0
Memory usage for squid via mallinfo():
total space in arena: 44132 KB
Ordinary blocks: 44024 KB 61 blks
Small blocks: 0 KB 2 blks
Holding blocks: 237028 KB 10 blks
Free Small blocks: 0 KB
Free Ordinary blocks: 107 KB
Total in use: 281052 KB 100%
Total free: 107 KB 0%
I can try to get more core dumps
Original comment by nuclear...@gmail.com
on 8 Feb 2010 at 9:50
Two hosts. 10.0.1.34 and 10.0.1.35
Key strings to reproduce:
htcp_port 4827
...
cache_peer 10.0.1.35 sibling 8080 4827 htcp proxy-only htcp-no-clr
and other one , 10.0.1.35 have
cache_peer 10.0.1.34 sibling 8080 4827 htcp proxy-only htcp-no-clr
If i remove htcp-no-clr on 10.0.1.34 , and during rebuild there is incoming http
requests - 10.0.1.34 will core dump.
Original comment by nuclear...@gmail.com
on 8 Feb 2010 at 9:52
I still can't trigger it here. Could you please recompile with ./configure
CFLAGS="-O0 -g" so gdb has a better chance of getting the debugging info? I'd
like to
see exactly what the heck it's doing and where when it crashes.
Original comment by adrian.c...@gmail.com
on 8 Feb 2010 at 10:11
#2 0x080b7580 in death (sig=11) at tools.c:326
No locals.
#3 <signal handler called>
No symbol table info available.
#4 0xf7e6d34b in ?? () from /lib/libc.so.6
No symbol table info available.
#5 0x08085bb3 in htcpBuildSpecifier (buf=0x819c26e "", buflen=8178,
stuff=0xffffd9e4) at htcp.c:293
off = 58
s = 5
#6 0x08085e2e in htcpBuildClrOpData (buf=0x819c26c "", buflen=8180,
stuff=0xffffd9e4) at htcp.c:351
reason = 256
#7 0x08085ed6 in htcpBuildOpData (buf=0x819c26c "", buflen=8180,
stuff=0xffffd9e4)
at htcp.c:371
off = 0
#8 0x08085f47 in htcpBuildData (buf=0x819c264 "", buflen=8188,
stuff=0xffffd9e4) at
htcp.c:390
off = 8
op_data_sz = 259062896
hdr_sz = 8
hdr = {length = 11824, response = 8 '\b', opcode = 14 '\016', RR = 1 '\001',
F1 = 1 '\001', reserved = 61 '=', msg_id = 1}
#9 0x08086155 in htcpBuildPacket (buf=0x819c260 "", buflen=8192,
stuff=0xffffd9e4)
at htcp.c:436
s = 0
off = 4
hdr_sz = 4
hdr = {length = 55692, major = 255 '\377', minor = 255 '\377'}
#10 0x080888ba in htcpClear (e=0x0, uri=0xf70fac8
"http://www.facebook.com/ajax/chat/settings.php", req=0xf70ee88,
method=0x8137730,
p=0x82a7870, reason=HTCP_CLR_INVALIDATION)
at htcp.c:1298
pkt = '\000' <repeats 13 times>,
"\001\000\003GET\000.http://www.facebook.com/ajax/chat/settings.php\000\003\061/
1",
'\000' <repeats 8119 times>
pktlen = 16150389
vbuf = "1/1\000\004I\036\b\350\363c\f\000\000\000\000\001http://www.\000\000\000"
stuff = {op = 4, rr = 0, f1 = 0, response = 0, reason = 1, msg_id = 1772, S =
{method = 0x810fb85 "GET",
uri = 0xfbfdab8 "http://www.facebook.com/ajax/chat/settings.php", version
= 0xffffda1c "1/1", req_hdrs = 0x2e <Address 0x2e out of bounds>, request =
0x81e4904},
D = {resp_hdrs = 0xc63f3e8 "uo\366\321%\022F\251\064c\317\217,\231\236+",
entity_hdrs = 0x3 <Address 0x3 out of bounds>,
cache_hdrs = 0xffffda48 "\210\332\377\377L\263\t\b"}}
hdr = {entries = {capacity = 0, count = -9736, items = 0x80efc63}, mask =
"\024\332\377\377\000\000\000\000\004", owner = 6, len = 0}
pa = {append = 0x8eb14c74, packer_vprintf = 0, real_handle = 0x0}
mb = {buf = 0xf70fc70 "http://www.facebook.com/ajax/chat/settings.php", size
= 0, max_capacity = 0, capacity = 0, stolen = 0}
flags = {proxying = 0, keepalive = 0, only_if_cached = 0, keepalive_broken =
0, abuse_detected = 0, request_sent = 0, front_end_https = 0, originpeer = 0,
chunked
= 0,
trailer = 0, http11 = 0}
#11 0x0809b34c in neighborsHtcpClear (e=0x0, uri=0xf70fac8
"http://www.facebook.com/ajax/chat/settings.php", req=0xf70ee88,
method=0x8137730,
reason=HTCP_CLR_INVALIDATION)
at neighbors.c:1540
p = 0x82a7870
i = 1
#12 0x080ab374 in storePurgeEntriesByUrl (req=0xf70ee88, url=0xf70fac8
"http://www.facebook.com/ajax/chat/settings.php") at store.c:436
m = 1
get_or_head_sent = 0
method = 0x8137730
e = 0x0
#13 0x0808906e in httpMaybeRemovePublic (e=0xf70ba28, reply=0xf70fbf8) at
http.c:266
status = 200
remove = 1
forbidden = 0
pe = 0x0
req = 0xf70ee88
reqUrl = 0xf70fac8 "http://www.facebook.com/ajax/chat/settings.php"
#14 0x08089d9d in httpReplySetupStuff (httpState=0xf79a4d8) at http.c:577
entry = 0xf70ba28
reply = 0xf70fbf8
#15 0x0808a2dd in httpProcessReplyHeader (httpState=0xf79a4d8, s=0) at
http.c:729
entry = 0xf70ba28
done = 387
reply = 0xf70fbf8
hdr_buf = 0x105466e0 "HTTP/1.0 200 OK\r\nCache-Control: private, no-store,
no-cache, must-revalidate, post-check=0, pre-check=0\r\nContent-Length:
96\r\nContent-Type: application/x-javascript; charset=utf-8\r\nExpires: Sat, 01
Jan"...
hdr_len = 387
hdr_size = 387
ctx = 0
#16 0x0808b2e3 in httpReadReply (fd=50, data=0xf79a4d8) at http.c:1141
reply = 0x0
httpState = 0xf79a4d8
entry = 0xf70ba28
len = 483
bin = 9
clen = 0
done = 0
already_parsed = 0
read_sz = 8192
po = 0
---Type <return> to continue, or q <return> to quit---
buffer_filled = 0
#17 0x080e5c44 in comm_call_handlers (fd=50, read_event=1, write_event=0) at
comm_generic.c:264
hdl = 0x808af11 <httpReadReply>
hdl_data = 0xf79a4d8
do_read = 1
F = 0xf32f9538
do_incoming = 1
#18 0x080e64dd in do_comm_select (msec=190) at comm_epoll.c:233
i = 31
num = 53
saved_errno = 115
#19 0x080e600d in comm_select (msec=190) at comm_generic.c:384
last_timeout = 1265669811.3832769
rc = 135156993
start = 1265669812.192826
#20 0x080e577b in iapp_runonce (msec=60000) at mainloop.c:68
loop_delay = 191
#21 0x0809566b in main (argc=2, argv=0xffffde14) at main.c:933
errcount = 0
loop_delay = 60000
Sure i can show any other variables
Original comment by nuclear...@gmail.com
on 8 Feb 2010 at 11:03
Is there anything more i can do? i can try to provide access to this hosts and
etc,
but it will be not easy, and latency to me is also terrible.
Original comment by nuclear...@gmail.com
on 9 Feb 2010 at 2:02
Can you go to that crashdump and do "frame 5" ? I'd like to see what the
function call it's making above.
Also, are you running the latest LUSCA_HEAD? I want to make sure the htcp.c
line numbers are correct.
Original comment by adrian.c...@gmail.com
on 10 Feb 2010 at 11:49
also, can you install the libc6 debug libraries? It seems you don't have symbol
info for libc?
Original comment by adrian.c...@gmail.com
on 10 Feb 2010 at 11:51
Possible it is not HEAD, but close to that. I will try to run HEAD, soon, it
will
need preliminary short testing so i will not crash production servers.
About libc6, thats a bit more difficult, but i will try to do that. It will
take time
- because it is not complete system, it is "semi-embedded".
(gdb) frame 5
#5 0x08085bb3 in htcpBuildSpecifier (buf=0x819c26e "", buflen=8178,
stuff=0xffffd9e4) at htcp.c:293
293 s = htcpBuildCountstr(buf + off, buflen - off, stuff->S.req_hdrs);
Original comment by nuclear...@gmail.com
on 10 Feb 2010 at 11:57
Ok, do this:
print buf
print off
print buflen
print stuff
print stuff->S.req_hdrs
if it is a pointer - print *stuff->S.req_hdrs
Original comment by adrian.c...@gmail.com
on 10 Feb 2010 at 12:25
(gdb) frame 5
#5 0x08085bb3 in htcpBuildSpecifier (buf=0x819c26e "", buflen=8178,
stuff=0xffffd9e4) at htcp.c:293
293 s = htcpBuildCountstr(buf + off, buflen - off, stuff->S.req_hdrs);
(gdb) print buf
$1 = 0x819c26e ""
(gdb) print off
$2 = 58
(gdb) print buflen
$3 = 8178
(gdb) print stuff
$4 = (htcpStuff *) 0xffffd9e4
(gdb) print stuff->S.req_hdrs
$5 = 0x2e <Address 0x2e out of bounds>
(gdb) print *stuff->S.req_hdrs
Cannot access memory at address 0x2e
If you want we can discuss this case in any IM (Jabber?), it will be faster.
I promiss to not bother you in any other case.
Original comment by nuclear...@gmail.com
on 10 Feb 2010 at 12:37
ok, try print *stuff .
And here is fine - I'd rather there be a public record. :)
Original comment by adrian.c...@gmail.com
on 10 Feb 2010 at 12:40
better then
(gdb) print *stuff
$6 = {op = 4, rr = 0, f1 = 0, response = 0, reason = 1, msg_id = 1772, S =
{method =
0x810fb85 "GET", uri = 0xfbfdab8
"http://www.facebook.com/ajax/chat/settings.php",
version = 0xffffda1c "1/1",
req_hdrs = 0x2e <Address 0x2e out of bounds>, request = 0x81e4904}, D =
{resp_hdrs = 0xc63f3e8 "uo\366\321%\022F\251\064c\317\217,\231\236+",
entity_hdrs =
0x3 <Address 0x3 out of bounds>,
cache_hdrs = 0xffffda48 "\210\332\377\377L\263\t\b"}}
Original comment by nuclear...@gmail.com
on 10 Feb 2010 at 12:43
req_hdrs and entity_hdrs seem .. a bit stupid. I wonder why. I wonder if it's
uninitialised memory.
Original comment by adrian.c...@gmail.com
on 13 Feb 2010 at 2:48
Damnit, it is. in htcpClear() there's no enforced initialisation of at least
S.req_hdrs.
I'll have a chat with mnot and see what the correct behaviour should be.
Original comment by adrian.c...@gmail.com
on 13 Feb 2010 at 2:52
I just committed something to LUSCA_HEAD to try and work around this. Try the
current
HEAD (at least revision 14402.)
Original comment by adrian.c...@gmail.com
on 13 Feb 2010 at 10:55
I will try to test it today.
Original comment by nuclear...@gmail.com
on 13 Feb 2010 at 1:10
Seems not crashing anymore. I will push current LUSCA_HEAD to production
servers to
see if there is any other issues.
Original comment by nuclear...@gmail.com
on 14 Feb 2010 at 7:28
That's good to hear.
Original comment by adrian.c...@gmail.com
on 14 Feb 2010 at 9:33
Original issue reported on code.google.com by
nuclear...@gmail.com
on 3 Sep 2009 at 11:16