xelerance / Openswan

Openswan
Other
856 stars 211 forks source link

2.6.52-dev1 segfault twice per day #421

Open AnatoliChe opened 4 years ago

AnatoliChe commented 4 years ago

Hi! I'm biting with sigfaults.

openswan 2.6.53-dev1 kernel: : pluto[5028]: segfault at 2f4 ip 000055baaf962800 sp 00007ffe73c97e30 error 4 in pluto[55baaf925000+a3000]

gdb OBJ.linux.x86_64/programs/pluto/pluto -c core-pluto.18393 Core was generated by /usr/local/libexec/ipsec/pluto --nofork --secretsfile /etc/ipsec.secrets --ipse. Program terminated with signal SIGSEGV, Segmentation fault.

0 setup_half_ipsec_sa (parent_st=parent_st@entry=0x0, st=st@entry=0x557936554bd0, sr=sr@entry=0x7ffd638d9620, inbound=inbound@entry=1)

at programs/pluto/kernel.c:1718

1718 const char *inbound_str = inbound ? "inbound" : "outbound";

(gdb) bt

0 setup_half_ipsec_sa (parent_st=parent_st@entry=0x0, st=st@entry=0x557936554bd0, sr=sr@entry=0x7ffd638d9620, inbound=inbound@entry=1)

at programs/pluto/kernel.c:1718

1 0x0000557935cc6f58 in install_ipsec_sa (parent_st=0x0, st=st@entry=0x557936554bd0, inbound_also=inbound_also@entry=1)

at programs/pluto/kernel.c:3020

2 0x0000557935cac4f7 in quick_inR1_outI2_cryptotail (r=r@entry=0x7ffd638d9ad0, dh=) at programs/pluto/ikev1_quick.c:2642

3 0x0000557935cac7be in quick_inR1_outI2_continue (pcrc=0x5579365bbc40, r=0x7ffd638d9ad0, ugh=0x0) at programs/pluto/ikev1_quick.c:2475

4 0x0000557935cd76af in handle_helper_comm (w=w@entry=0x55793643e3f0) at programs/pluto/pluto_crypt.c:827

5 0x0000557935cd84bb in pluto_crypto_helper_ready (readfds=readfds@entry=0x7ffd638db180) at programs/pluto/pluto_crypt.c:1101

6 0x0000557935c9a989 in call_server () at programs/pluto/server.c:798

7 0x0000557935c84faf in main (argc=, argv=) at programs/pluto/plutomain.c:1134

(gdb) frame 0

0 setup_half_ipsec_sa (parent_st=parent_st@entry=0x0, st=st@entry=0x557936554bd0, sr=sr@entry=0x7ffd638d9620, inbound=inbound@entry=1)

at programs/pluto/kernel.c:1718

1718 const char inbound_str = inbound ? "inbound" : "outbound"; (gdb) list 1713 bool outgoing_ref_set = FALSE; 1714 bool incoming_ref_set = FALSE; 1715 IPsecSAref_t refhim = st->st_refhim; 1716 IPsecSAref_t new_refhim = IPSEC_SAREF_NULL; 1717
1718 const char
inbound_str = inbound ? "inbound" : "outbound"; 1719
1720 / SPIs, saved for spigrouping or undoing, if necessary / 1721 struct kernel_sa 1722 said[EM_MAXRELSPIS],

(gdb) print inbound $1 = 1

(gdb) frame 1

1 0x0000557935cc6f58 in install_ipsec_sa (parent_st=0x0, st=st@entry=0x557936554bd0, inbound_also=inbound_also@entry=1)

at programs/pluto/kernel.c:3020

3020 if(!setup_half_ipsec_sa(parent_st, st, sr, TRUE)) { (gdb) list 3015 } 3016
3017 DBG(DBG_KLIPS, DBG_log("state #%lu: now setting up incoming SA", st->st_serialno)); 3018 / now setup inbound SA / 3019 if(st->st_ref == IPSEC_SAREF_NULL && inbound_also) { 3020 if(!setup_half_ipsec_sa(parent_st, st, sr, TRUE)) { 3021 loglog(RC_LOG_SERIOUS, "state #%lu: failed to setup incoming SA", st->st_serialno); 3022 return FALSE; 3023 } 3024 DBG(DBG_KLIPS, DBG_log("state #%lu: set up incoming SA, ref=%u/%u", st->st_serialno, st->st_ref, st->st_refhim));

(gdb) print st $2 = (struct state ) 0x557936554bd0 (gdb) print sr $3 = (struct spd_route ) 0x7ffd638d9620 (gdb) print parent_st $4 = (struct state *) 0x0

I do not fully understand how const char inbound_str = inbound ? "inbound" : "outbound"; eg const char inbound_str = 1 ? "inbound" : "outbound"; can cause segfault. Maybe I should compile with -O0?

shussain commented 4 years ago

@AnatoliChe Thank you for reportin git.

Can you please test with OSW 2.6.52rc1? If the issue still occurs, can you please post your ipsec.conf

AnatoliChe commented 4 years ago

With rc1 I have much more interesting error. I believe it's related with problems with internet connection and corrupt packets from internet. 2.6.52dev1 from local side and segfault as you can see higher. 2.6.52-rc1 from remote side and more interesting error with segfault. ipsec --version Linux Openswan U2.6.52rc1-5-g6f58e3037/K4.19.84-ipsec-core2-smp (netkey)

| next event EVENT_DPD in 0 seconds for #103 (2019-12-19 19:57:34) | *time to handle event | at 2019-12-19 19:57:34 handling event EVENT_DPD | event after this is EVENT_DPD in 0 seconds | processing connection KvPZCVt-dpt0 | find_phase1_state: found SA #105 for conn 'KvPZCVt-dpt0' in state STATE_PARENT_R2 | DPD: processing for state #103 ("KvPZCVt-dpt0") | get esp.6584de49@10.10.10.11 | inserting event EVENT_DPD, timeout in 30 seconds for #103 | event added after event EVENT_DPD for #104 | DPD: scheduling timeout to 120 | state: 105 requesting event none to be deleted by dpd_sched_timeout:201 | inserting event EVENT_DPD_TIMEOUT, timeout in 120 seconds for #105 | event added after event EVENT_PENDING_PHASE2 | DPD: sending R_U_THERE 4414 to 10.10.10.10:500 (state #105) | **emit ISAKMP Message: | initiator cookie: | b2 f4 36 2e 4a 1c c5 2b | responder cookie: | df e6 42 52 69 35 7d 29 | ISAKMP version: ISAKMP Version 1.0 (rfc2407) | exchange type: ISAKMP_XCHG_INFO | flags: ISAKMP_FLAG_ENCRYPTION | message ID: d2 49 2f 5c | ***emit ISAKMP Hash Payload: | emitting 20 zero bytes of HASH into ISAKMP Hash Payload | emitting length of ISAKMP Hash Payload: 24 | ***emit ISAKMP Notification Payload: | DOI: ISAKMP_DOI_IPSEC | protocol ID: 1 | SPI size: 16 | Notify Message Type: R_U_THERE | emitting 8 raw bytes of notify icookie into ISAKMP Notification Payload | notify icookie b2 f4 36 2e 4a 1c c5 2b | emitting 8 raw bytes of notify rcookie into ISAKMP Notification Payload | notify rcookie df e6 42 52 69 35 7d 29 | emitting 4 raw bytes of notify data into ISAKMP Notification Payload | notify data 00 00 11 3e | emitting length of ISAKMP Notification Payload: 32 | HASH computed: | 77 87 dd 08 e1 8e 0d 25 e5 5d 6b e9 87 9d e2 3e | d1 ee 23 4c | last Phase 1 IV: | current Phase 1 IV: | computed Phase 2 IV: | 88 9c 1e 70 ef b5 f1 0a 84 f0 e0 f1 a0 25 89 ea | 9d f3 fd 0e | encrypting: | 0b 00 00 18 77 87 dd 08 e1 8e 0d 25 e5 5d 6b e9 | 87 9d e2 3e d1 ee 23 4c 00 00 00 20 00 00 00 01 | 01 10 8d 28 b2 f4 36 2e 4a 1c c5 2b df e6 42 52 | 69 35 7d 29 00 00 11 3e | IV: | 88 9c 1e 70 ef b5 f1 0a 84 f0 e0 f1 a0 25 89 ea | 9d f3 fd 0e | unpadded size is: 56 | emitting 8 zero bytes of encryption padding into ISAKMP Message | encrypting 64 using 12?? pluto_crypto_helper: helper [nonnss] (0) is exiting normally Segmentation fault (core dumped) pluto_crypto_helper: helper [nonnss] (1) is exiting normally pluto_crypto_helper: helper [nonnss] (2) is exiting normally

config `conn KvPZCVt-dpt0 auto=start type=transport

direction=dpt

    ike=aes128-sha1
    esp=aes128-sha1
    right=10.10.10.10
    rightnexthop=80.78.38.201
    rightid=@dpt0.ipsec.tkb
    rightrsasigkey=0sAQN+
    left=10.10.10.11
    leftid=@KievPZCVtorm.ipsec.tkb
    leftrsasigkey=0sAQN5mECDoomDkKl+

status: ipsec auto --status | grep KvPZCVt-dpt0 000 "KvPZCVt-dpt0": 10.10.10.11[@KvPZCVt.ipsec.tkb]...80.78.38.201---10.10.10.10[@dpt0.ipsec.tkb]; erouted; eroute owner: #251 000 "KvPZCVt-dpt0": myip=unset; hisip=unset; 000 "KvPZCVt-dpt0": keys: 1:A6E8 79CF 2403 1000 2744 DCAA 4705 5588 ED7A ABCD 2:none... 000 "KvPZCVt-dpt0": ....1:A032 6524 812B F784 4EC6 1607 40AD ADD0 911D 01D8 2:none 000 "KvPZCVt-dpt0": ike_life: 3600s; ipsec_life: 1200s; rekey_margin: 300s; rekey_fuzz: 50%; keyingtries: 0; firstmsgid: 0 000 "KvPZCVt-dpt0": policy: RSASIG+ENCRYPT+PFS+UP+IKEv2ALLOW+SAREFTRACK; prio: 32,32; interface: eth0; kind=CK_PERMANENT 000 "KvPZCVt-dpt0": dpd: action:restart; delay:30; timeout:120;
000 "KvPZCVt-dpt0": newest ISAKMP SA: #240; newest IPsec SA: #251; eroute owner: #251; 000 "KvPZCVt-dpt0": IKE algorithms wanted: AES_CBC(7)_128-SHA1(2)_000-MODP1536(5), AES_CBC(7)_128-SHA1(2)_000-MODP1024(2); flags=-strict 000 "KvPZCVt-dpt0": IKE algorithms found: AES_CBC(7)_128-SHA1(2)_160-MODP1536(5)AES_CBC(7)_128-SHA1(2)_160-MODP1024(2) 000 "KvPZCVt-dpt0": IKE algorithm newest: AES_CBC_128-SHA1-MODP1536 000 "KvPZCVt-dpt0": ESP algorithms wanted: AES(12)_128-SHA1(2)_000; flags=-strict 000 "KvPZCVt-dpt0": ESP algorithms loaded: AES(12)_128-SHA1(2)_160 000 "KvPZCVt-dpt0": ESP algorithm newest: AES_128-HMAC_SHA1; pfsgroup= 000 #251: "KvPZCVt-dpt0":500 IKEv1.0 STATE_QUICK_I2 (sent QI2, IPsec SA established); EVENT_SA_REPLACE in 698s; newest IPSEC; eroute owner; isakmp#240; idle; import:admin initiate 000 #251: "KvPZCVt-dpt0" esp.484fc879@10.10.10.10 esp.4bfa6e7c@10.10.10.11 ref=0 refhim=4294901761 000 #248: "KvPZCVt-dpt0":500 IKEv1.0 STATE_QUICK_I2 (sent QI2, IPsec SA established); EVENT_SA_EXPIRE in 234s; isakmp#240; idle; import:admin initiate 000 #248: "KvPZCVt-dpt0" esp.9b75130b@10.10.10.10 esp.a48f5339@10.10.10.11 ref=0 refhim=4294901761 000 #240: "KvPZCVt-dpt0":500 IKEv1.0 STATE_MAIN_I4 (ISAKMP SA established); EVENT_SA_REPLACE in 755s; newest ISAKMP; lastdpd=24s(seq in:21924 out:0); idle; import:admin initiate `

gdb pluto -c core-pluto.16268.kvpzv.1576778254 GNU gdb (Debian 8.2.1-2+b3) 8.2.1 Core was generated by/usr/local/libexec/ipsec/pluto --nofork --secretsfile /etc/ipsec.secrets --ipse'. Program terminated with signal SIGSEGV, Segmentation fault.

0 aes_set_key (cx=cx@entry=0x7ffd81c5f5b0, in_key=0x0, n_bytes=0, f=f@entry=0) at lib/libcrypto/libaes/aes.c:987

987 cx->aes_e_key[0] = const_word_in(in_key ); `

(gdb) bt `#0 aes_set_key (cx=cx@entry=0x7ffd81c5f5b0, in_key=0x0, n_bytes=0, f=f@entry=0) at lib/libcrypto/libaes/aes.c:987

1 0x000055bf86c5e957 in do_aes (buf=0x55bf86d038bc <reply_buffer+28> "\v", buf_len=64, key=, key_size=,

iv=0x55bf889dcdf8 "\210\234\036p\357\265\361\n\204\360\340\361\240%\211\352\235\363\375\016", enc=1) at programs/pluto/ike_alg_aes.c:103

2 0x000055bf86c1e0c5 in encrypt_message (pbs=pbs@entry=0x7ffd81c5f870, st=st@entry=0x55bf889dc380) at programs/pluto/ikev1_main.c:531

3 0x000055bf86c21950 in send_isakmp_notification (st=st@entry=0x55bf889dc380, type=type@entry=36136, data=data@entry=0x7ffd81c5fc6c, len=len@entry=4)

at programs/pluto/ikev1_main.c:2396

4 0x000055bf86c08dd0 in dpd_outI (p1st=0x55bf889dc380, st=0x55bf889d64e0, eroute_care=, delay=30, timeout=)

at programs/pluto/dpd.c:329

5 0x000055bf86c14760 in handle_a_timer_event (ev=0x55bf889d28d0) at programs/pluto/timer.c:537

6 0x000055bf86c14faa in handle_timer_event () at programs/pluto/timer.c:442

7 0x000055bf86c139d9 in call_server () at programs/pluto/server.c:810

8 0x000055bf86bfdfaf in main (argc=, argv=) at programs/pluto/plutomain.c:1134

`

(gdb) frame 0 `

0 aes_set_key (cx=cx@entry=0x7ffd81c5f5b0, in_key=0x0, n_bytes=0, f=f@entry=0) at lib/libcrypto/libaes/aes.c:987

987 cx->aes_e_key[0] = const_word_in(in_key ); (gdb) list 982 break; 983 } 984
985 cx->aes_Nrnd = (cx->aes_Nkey > nc ? cx->aes_Nkey : nc) + 6; 986
987 cx->aes_e_key[0] = const_word_in(in_key ); 988 cx->aes_e_key[1] = const_word_in(in_key + 4); 989 cx->aes_e_key[2] = const_word_in(in_key + 8); 990 cx->aes_e_key[3] = const_word_in(in_key + 12); `

(gdb) print in_key $1 = (const unsigned char *) 0x0 (gdb) print cx->aes_Nrnd $2 = 10 (gdb) print cx->aes_e_key[0] $3 = 2261782116 (gdb) print cx->aes_Nkey $4 = 4 (gdb) print nc No symbol "nc" in current context.

AnatoliChe commented 4 years ago

disass /m 987 cx->aes_e_key[0] = const_word_in(in_key ); 0x000055bf86c7101a <+74>: mov (%rsi),%edx 0x000055bf86c7101c <+76>: lea 0x56fdd(%rip),%r14 # 0x55bf86cc8000 <rcon_tab> 0x000055bf86c71023 <+83>: lea 0x53fd6(%rip),%rax # 0x55bf86cc5000 <fl_tab> 0x000055bf86c7102a <+90>: mov %edx,0x8(%rdi) => 0x000055bf86c712a9 <+729>: mov (%rsi),%edx 0x000055bf86c712ab <+731>: lea 0x56d4e(%rip),%rbp # 0x55bf86cc8000 <rcon_tab> 0x000055bf86c712b2 <+738>: lea 0x53d47(%rip),%rax # 0x55bf86cc5000 <fl_tab> 0x000055bf86c712b9 <+745>: mov %edx,0x8(%rdi) 0x000055bf86c7135b <+907>: mov (%rsi),%edx 0x000055bf86c7135d <+909>: lea 0x56c9c(%rip),%rbp # 0x55bf86cc8000 <rcon_tab> 0x000055bf86c71364 <+916>: mov %edx,0x8(%rdi)

We have `#ifdef AES_LE_OK / little endian processor without data alignment restrictions /

define word_in(x) (u_int32_t)(x)

define const_word_in(x) (const u_int32_t)(x)

define word_out(x,v) (u_int32_t)(x) = (v)

define const_word_out(x,v) (const u_int32_t)(x) = (v)

else

/ slower but generic big endian or with data alignment restrictions / / some additional "const" touches to stop "gcc -Wcast-qual" complains --jjo /

define word_in(x) ((u_int32_t)(((unsigned char )(x))[0])|((u_int32_t)(((unsigned char )(x))[1])<<8)|((u_int32_t)(((unsigned char )(x))[2])<<16)|((u_int32_t)(((unsigned char )(x))[3

define const_word_in(x) ((const u_int32_t)(((const unsigned char )(x))[0])|((const u_int32_t)(((const unsigned char )(x))[1])<<8)|((const u_int32_t)(((const unsigned char *)(x))[2])<

define word_out(x,v) ((unsigned char )(x))[0]=(v),((unsigned char )(x))[1]=((v)>>8),((unsigned char )(x))[2]=((v)>>16),((unsigned char )(x))[3]=((v)>>24)

define const_word_out(x,v) ((const unsigned char )(x))[0]=(v),((const unsigned char )(x))[1]=((v)>>8),((const unsigned char )(x))[2]=((v)>>16),((const unsigned char )(x))[3]=((v)>>24)

endif

`

This is like to some kind of magic :-D

AnatoliChe commented 4 years ago

I switched ipsec from aes128-sha1 to aes_ctr160-sha1 and do not have segfault already 3 days.

letoams commented 4 years ago

Do you compile with nss? Nss is an active maintained and certified crypto library, unlike the mostly abandoned openswan native crypto.

Sent from my iPhone

On Dec 23, 2019, at 02:36, Anatoli notifications@github.com wrote:

I switched ipsec from aes128-sha1 to aes_ctr160-sha1 and do not have segfault already 3 days.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.

AnatoliChe commented 4 years ago

No it was without nss. I understand nss is future. But it's interesting, why in some cases of network problems in_key = (const unsigned char *) 0x0 Why const_word_in(0) is segfault I understand :-)

letoams commented 4 years ago

On Mon, 23 Dec 2019, Anatoli wrote:

No it was without nss. I understand nss is future. But it's interesting, why in some cases of network problems in_key = (const unsigned char *) 0x0

I don't know. That code hasn't been maintained since the libreswan fork in 2011. libreswan at the time removed all non-nss crypto code, so I have no more experience with the openswan ancient crypto code.

AnatoliChe commented 4 years ago

libreswan at the time removed all non-nss crypto code

So, it's time clean up LSW from: aes aes.c des and some other?

letoams commented 4 years ago

On Mon, 23 Dec 2019, Anatoli wrote:

  libreswan at the time removed all non-nss crypto code

So, it's time clean up LSW from aes aes.c

Those are part of the KLIPS kernel module, which is scheduled for removal in 3.31 (after the current code is released as 3.30)

These files are not used for the userland IKE daemon.

Paul