Tencent / mars

Mars is a cross-platform network component developed by WeChat.
Other
17.29k stars 3.63k forks source link

Mars长链接突然链接不上了,alarm start之后,就没有其他运行 #1200

Open marchlqq opened 10 months ago

marchlqq commented 10 months ago

现象:用户在14点07-08使用长链接操作,但是都发送失败。 查询用户日志,用户在14点01分,长链接断开,后面就没有恢复成功。 请问一下:什么原因造成呢?

概率:非必现。 最近操作:编译 targetSdk 升级到 33。 手机型号:Android 13

[E][2023-12-14 +8.0 14:01:27.450][28431, 37528][yzj-im][, , 0][write = 2023-12-14 14:01:27.450 yzj-im 长连接恢复,尝试发起一次query newPush cmd [D][2023-12-14 +8.0 14:01:27.450][28431, 28662][mars::stn][com_tencent_mars_stn_StnLogic_C2Java.cc, operator, 417][all_connstatus = 4, longlink_connstatus = 4 [I][2023-12-14 +8.0 14:01:27.451][28431, 28662][mars::stn][net_core.cc, operator, 322][task start long short taskid:708, cmdid:2, need_authed:true, cgi:, channel_select:2, limit_flow:false, host:, send_only:false, cmdid:2, server_process_cost:-1, retrycount:0, channel_strategy:0, total_timetout:12000, network_status_sensitive:false, priority:3, report_arg: [D][2023-12-14 +8.0 14:01:27.451][28431, 28662][mars::stn][longlink_task_manager.cc, StartTask, 83][taskid=708 [E][2023-12-14 +8.0 14:01:27.451][28431, 37528][yzj-im][, , 0][write = 2023-12-14 14:01:27.451 yzj-im Sending mars task within IM process, taskId = 708, req2buf={"cmd":"newPush","type":"query","deviceId":"OAID2a74ca6be6e94cbbba160da48c3056c3"} [I][2023-12-14 +8.0 14:01:27.451][28431, 28662][mars::][longlink_packer.cc, operator, 556][longlink_pack, sessionid:170575536723323266, cmdid:2, body_length:96, org_length:82 [I][2023-12-14 +8.0 14:01:27.451][28431, 28662][mars::stn][longlink_task_manager.cc, __RunOnStartTask, 382][task add into longlink readwrite suc cgi:, cmdid:2, taskid:708, size:82, timeout(firstpkg:15, rw:31, task:12), retry:0

[D][2023-12-14 +8.0 14:01:27.482][28431, 20727][mars::comm][platform_comm.cc, startAlarm, 58][id= 2475, after= 270000, ret= true [I][2023-12-14 +8.0 14:01:27.482][28431, 20727][mars::comm][alarm.cc, Start, 67][alarm id:541425896392, after:270000, seq:2475, po.reg.q:540795706544,po.reg.s:0,po.s:16609 [I][2023-12-14 +8.0 14:01:27.483][28431, 20727][mars::stn][longlink.cc, __RunReadWrite, 597][start noop taskid:4294967295, cmdid:6, last:(0,0), next:270000 [D][2023-12-14 +8.0 14:01:27.484][28431, 20727][mars::comm][platform_comm.cc, stopAlarm, 72][id= 2475, ret= true

[I][2023-12-14 +8.0 14:01:27.484][28431, 20727][mars::comm][alarm.cc, Cancel, 90][alarm cancel id:541425896392, seq:2475, after:270000

[D][2023-12-14 +8.0 14:01:27.492][28431, 20727][mars::comm][platform_comm.cc, startAlarm, 58][id= 2476, after= 270000, ret= true [I][2023-12-14 +8.0 14:01:27.493][28431, 20727][mars::comm][alarm.cc, Start, 67][alarm id:541425896392, after:270000, seq:2476, po.reg.q:540795706544,po.reg.s:0,po.s:16610 [I][2023-12-14 +8.0 14:01:27.493][28431, 20727][mars::stn][longlink.cc, RunReadWrite, 675][task socket send sock:84, all send:140, count:2, sub send taskid:4294967295, cmdid:0, , len(S:22, 22/22), sub send taskid:708, cmdid:2, , len(S:118, 118/118), [D][2023-12-14 +8.0 14:01:27.494][28431, 28662][mars::stn][longlink_task_manager.cc, OnSend, 611][taskid:708, starttime:2417753 [I][2023-12-14 +8.0 14:01:27.521][28431, 20727][mars::stn][longlink.cc, RunReadWrite, 776][task socket recv sock:84, recv len:44, buff len:44 [I][2023-12-14 +8.0 14:01:27.522][28431, 20727][mars::stn][longlink.cc, RunReadWrite, 807][task socket recv sock:84, pack recv finish taskid:4294967295, cmdid:6, , packlen:(22/22) [I][2023-12-14 +8.0 14:01:27.522][28431, 20727][mars::stn][longlink.cc, __NoopResp, 323][end noop [D][2023-12-14 +8.0 14:01:27.523][28431, 20727][mars::comm][platform_comm.cc, stopAlarm, 72][id= 2474, ret= true

[I][2023-12-14 +8.0 14:01:27.523][28431, 20727][mars::comm][alarm.cc, Cancel, 90][alarm cancel id:541425896552, seq:2474, after:5000 [D][2023-12-14 +8.0 14:01:27.523][28431, 20727][mars::stn][smart_heartbeat.cc, OnHeartResult, 106][heart result:true, false [I][2023-12-14 +8.0 14:01:27.523][28431, 20727][mars::stn][longlink.cc, RunReadWrite, 807][task socket recv sock:84, pack recv finish taskid:708, cmdid:2, , packlen:(22/22) [D][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][com_tencent_mars_stn_StnLogic_C2Java.cc, operator, 254][the decodeBuffer.Lenght() <= 0 [D][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][dynamic_timeout.cc, StatusSwitch, 185][task_status:1, good_count:97, good_time:0, dyntime_status:1, dyntime_failed_normal_count_NORMAL:10, cgi: [I][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][longlink_task_manager.cc, __SingleRespHandle, 414][task end callback long cmdid:2, err(0, 0, 0), svr(157.255.49.239:20080, NewDNSIP, mtp.oppowork.com), cli(, 10.0.110.251, n:46006, sig:0), cost(s:82, r:0, c:38, rw:30), all:73, retry:0, cgi:, taskid:708, tid:20727 [D][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][net_check_logic.cc, UpdateLongLinkInfo, 118][shortlink:_continueFailCount=0, _isTaskSucc=1, records=0xaaaaaaa9 [I][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][zombie_task_manager.cc, RedoTasks, 117][-> RedoTasks [I][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][zombie_task_manager.cc, RedoTasks, 117][<- RedoTasks +0, [D][2023-12-14 +8.0 14:01:27.524][28431, 28662][mars::stn][net_source.cc, ReportLongIP, 239][_is_success=true, ip=157.255.49.239, port=20080 [E][2023-12-14 +8.0 14:01:33.166][28431, 20727][mars::stn][longlink.cc, __RunReadWrite, 589][task socket close sock:84, excptoin:103(Software caused connection abort), nread:0, nwrite:0, net_type:46006task socket close getsocktcpinfo:tcpi_state=0x7, tcpi_ca_state=0x0, tcpi_retransmits=0x0, tcpi_probes=0x0, tcpi_backoff=0x0, tcpi_options=0x6, tcpi_rto=0x3A980, tcpi_snd_mss=0x550, tcpi_rcv_mss=0x218, tcpi_unacked=0x0, tcpi_sacked=0x0, tcpi_lost=0x0, tcpi_retrans=0x0, tcpi_fackets=0x0, tcpi_last_data_sent=0x1634, tcpi_last_ack_sent=0x0, tcpi_last_data_recv=0x1610, tcpi_last_ack_recv=0x1614, tcpi_pmtu=0x578, tcpi_rcv_ssthresh=0xFFFF, tcpi_rtt=0x8D07, tcpi_rttvar=0x3881, tcpi_snd_ssthresh=0x578, tcpi_snd_cwnd=0xA, tcpi_advmss=0x550, tcpi_reordering=0x3, tcpi_rcv_rtt=0x0, tcpi_rcv_space=0x3520, tcpi_total_retrans=0x [D][2023-12-14 +8.0 14:01:33.177][28431, 20727][mars::comm][platform_comm.cc, stopAlarm, 72][id= 2476, ret= true

[I][2023-12-14 +8.0 14:01:33.177][28431, 20727][mars::comm][alarm.cc, Cancel, 90][alarm cancel id:541425896392, seq:2476, after:270000 [I][2023-12-14 +8.0 14:01:33.179][28431, 20727][mars::stn][longlink.cc, __ConnectStatus, 355][connect status from:2 to:3, nettype:2

[I][2023-12-14 +8.0 14:01:33.179][28431, 20727][mars::stn][smart_heartbeat.cc, OnLongLinkDisconnect, 81][-> OnLongLinkDisconnect [I][2023-12-14 +8.0 14:01:33.179][28431, 20727][mars::stn][smart_heartbeat.cc, OnLongLinkDisconnect, 94][46006 not stable last heart:270000 [I][2023-12-14 +8.0 14:01:33.179][28431, 20727][mars::stn][smart_heartbeat.cc, OnLongLinkDisconnect, 81][<- OnLongLinkDisconnect +0, [I][2023-12-14 +8.0 14:01:33.181][28431, 20727][mars::stn][longlink.cc, Run, 384][<- Run +5894,

[D][2023-12-14 +8.0 14:01:33.184][28431, 28662][mars::comm][platform_comm.cc, startAlarm, 58][id= 2477, after= 500, ret= true [I][2023-12-14 +8.0 14:01:33.184][28431, 28662][mars::comm][alarm.cc, Start, 67][alarm id:12970367473015327544, after:500, seq:2477, po.reg.q:540795706544,po.reg.s:0,po.s:16619 [D][2023-12-14 +8.0 14:01:33.186][28431, 28662][mars::comm][platform_comm.cc, startAlarm, 58][id= 2478, after= 600000, ret= true [I][2023-12-14 +8.0 14:01:33.187][28431, 28662][mars::comm][alarm.cc, Start, 67][alarm id:12970367467874194432, after:600000, seq:2478, po.reg.q:540795706544,po.reg.s:0,po.s:16620 [W][2023-12-14 +8.0 14:01:33.187][28431, 28662][mars::stn][longlink_task_manager.cc, __OnResponse, 532][task error, taskid:0, cmdid:0, error_type:4, error_code:103 [D][2023-12-14 +8.0 14:01:33.187][28431, 28662][mars::stn][net_check_logic.cc, UpdateLongLinkInfo, 118][shortlink:_continueFailCount=0, _isTaskSucc=0, records=0x55555552 [D][2023-12-14 +8.0 14:01:33.187][28431, 28662][mars::stn][net_source.cc, ReportLongIP, 239][_is_success=false, ip=157.255.49.239, port=20080 [I][2023-12-14 +8.0 14:01:33.680][28431, 28662][mars::comm][alarm.cc, OnAlarm, 131][OnAlarm id:12970367473015327544, seq:2477, elapsed:500, after:500, miss:0, android alarm:false, runing [I][2023-12-14 +8.0 14:01:33.680][28431, 28662][mars::stn][longlink_connect_monitor.cc, __AutoIntervalConnect, 203][start auto connect after:187611

[D][2023-12-14 +8.0 14:01:33.685][28431, 28662][mars::comm][platform_comm.cc, startAlarm, 58][id= 2479, after= 187611, ret= true [I][2023-12-14 +8.0 14:01:33.685][28431, 28662][mars::comm][alarm.cc, Start, 67][alarm id:12970367473015327544, after:187611, seq:2479, po.reg.q:540795706544,po.reg.s:0,po.s:16623

[I][2023-12-14 +8.0 14:01:36.603][28431, 2][TeamTalk][, , 0][onTrimMemory: 5 [I][2023-12-14 +8.0 14:01:36.621][28431, 2][TeamTalk][, , 0][onTrimMemory: totalMem24298254336, available Mem8383197184,threshold:679477248,isLowMemory:false [I][2023-12-14 +8.0 14:01:38.198][28431, 2][MicroMsg.Alarm][, , 0][onReceive id=2477, curId=2477 [I][2023-12-14 +8.0 14:01:38.198][28431, 2][MicroMsg.Alarm][, , 0][onReceive find alarm id:2477, pid:28431, delta miss time:4518 [D][2023-12-14 +8.0 14:01:38.203][28431, 28431*][mars::comm][OnAlarm.inl, Java_com_tencent_mars_comm_Alarm_onAlarm, 13][BroadcastMessage seq:2477

marchlqq commented 10 months ago

查询了一下, platform_comm.cc, startAlarm, 58][id= 2477, after= 500, ret= true platform_comm.cc, startAlarm, 58][id= 2478, after= 600000, ret= true onReceive id=2477, curId=2477 这里卡在了,alarm 定时任务运行,没有正常运行。 am.setExact(AlarmManager.ELAPSED_REALTIME_WAKEUP, time, pendingIntent);

用户在前台,也没有运行 alarm,是因为600s,10分钟的时间设定?还是因为 alarm存在问题,没有正常运行呢。 为什么会出现600s?

marchlqq commented 10 months ago

mars 进程,在14:01就停止了,app主进程的代码还在运行,所以现在在想,是mars进程,被回收了。就很奇怪。